
i-chaochen/age-gender-estimation

Keras implementation of a CNN for age and gender estimation

i-chaochen/AI-System

Educational resources for AI systems.

i-chaochen/apollo

An open autonomous driving platform

i-chaochen/Awesome-GPU

Awesome resources for GPUs

i-chaochen/awesome-modern-cpp

A collection of resources on modern C++

i-chaochen/awesome-tensor-compilers

A list of awesome compiler projects and papers for tensor computation and deep learning.

i-chaochen/DeepLearningSystem

An introduction to the core principles of deep learning systems.

i-chaochen/DIEN-tf2

Deep Interest Network / Deep Interest Evolution Network for Click-Through Rate Prediction

i-chaochen/Few-Shot-

Few-shot learning in NLP

pull request comment ROCmSoftwarePlatform/tensorflow-upstream

Small test cleanup of third_party/xla

retest Ubuntu-CPU please

draganmladjenovic

comment created time in 2 days

Pull request review event

pull request comment ROCmSoftwarePlatform/tensorflow-upstream

Small test cleanup of third_party/xla

Can this PR fix dot_dimension_sorter_test on the XLA side?

It was passing for me, so I assumed it was fixed in the meantime. I'll wait to see if it shows up in the logs.

Great! Please also create a PR to upstream this into XLA. Thanks for the fix!

draganmladjenovic

comment created time in 2 days

pull request comment ROCmSoftwarePlatform/tensorflow-upstream

Small test cleanup of third_party/xla

retest gpu-pycpp please
retest cpu-pycpp please
retest gpu-non-pip-multi please

draganmladjenovic

comment created time in 2 days

pull request comment ROCmSoftwarePlatform/tensorflow-upstream

Small test cleanup of third_party/xla

Can this PR fix dot_dimension_sorter_test on the XLA side?

draganmladjenovic

comment created time in 2 days

pull request comment openxla/xla

[ROCm] Unifying hip/cuda blas-lt APIs

Hi @akuegel, do you have any suggestions on this PR? Thanks!

pemeliya

comment created time in 2 days

pull request comment openxla/xla

[ROCm] HipBLASLt row major layout support.

Hi @ddunl, what's the status of this PR? Does it still need approval from another reviewer?

wenchenvincent

comment created time in 3 days

pull request comment openxla/xla

[ROCm] Unifying hip/cuda blas-lt APIs

@ddunl, could you find the relevant people to review this PR, please? Thanks!

pemeliya

comment created time in 4 days

push event ROCmSoftwarePlatform/triton

Chao Chen

commit sha cfb3b2e967299a183897214b91615a3dadc9f517

keep rocm side only

view details

push time in 4 days

create branch ROCmSoftwarePlatform/triton

branch : triton-mlir-xla-softmax

created branch time in 4 days

pull request comment openxla/xla

[ROCm] fixed rocm kernel link

Thanks @ezhulenev, I will recheck it tomorrow once you're finished.

i-chaochen

comment created time in 5 days

pull request comment openxla/xla

[ROCm] fixed rocm kernel link

Thanks for the notice. Yes, we are monitoring it with our nightly CI, and we will do our best to keep up.

i-chaochen

comment created time in 5 days

pull request comment openxla/xla

[ROCm] fixed rocm kernel link

Out of curiosity, do you know which change caused the error? I know there have been a lot of changes to stream_executor recently, and I'm wondering if those are causing issues for you.

I think it's something related to https://github.com/openxla/xla/commit/d04387fd4075af9d70858b2910cfa19a28a28fee and this is a follow-up based on https://github.com/openxla/xla/commit/5c405c4a25d7f32036c03995383462d6ce9a5619

It seems the ROCm GPU executor now requires //xla/stream_executor:kernel, but CUDA doesn't need it (or maybe it gets it from somewhere else).

https://github.com/openxla/xla/pull/5867/files#diff-923ec2612240244a60ae28c17f798a7daeccbab84a8e9de9efd7ba96c311c763R107

i-chaochen

comment created time in 5 days

Pull request review event

PR opened openxla/xla

[ROCm] fixed rocm kernel link

Fixed the missing link in the ROCm GPU executor

+26 -17

0 comment

5 changed files

pr created time in 5 days

create branch ROCmSoftwarePlatform/xla

branch : rocm_build_error

created branch time in 5 days

push event ROCmSoftwarePlatform/triton

Christian Sigg

commit sha b661030287ebc3fcc2832bd53499e60706a0a924

[BACKEND] Fixes for latest llvm changes. To build against llvm/llvm-project@50665511c79f62d97c0d6602e1577b0abe1b982f.

view details

push time in 8 days

push event ROCmSoftwarePlatform/triton

Christian Sigg

commit sha 6b833f1d5397241e30e1b14477663453ddb22b4c

OpenXLA-specific changes.

- Disable short pointer.
- Fixup expensiveLoadOrStore().

view details

push time in 8 days

PR opened openxla/xla

[ROCm] fixes kernel name in ROCm

Fixes the kernel name in ROCm, following https://github.com/openxla/xla/commit/d04387fd4075af9d70858b2910cfa19a28a28fee

@akuegel @ezhulenev Thanks in advance!

+5 -5

0 comment

1 changed file

pr created time in 8 days

create branch ROCmSoftwarePlatform/xla

branch : fix_rocm_kernelname

created branch time in 8 days

Commit comment event

create branch ROCmSoftwarePlatform/triton

branch : xla-softmax

created branch time in 9 days

Commit comment event

pull request comment openxla/xla

[ROCm] gpu command buffer for ROCm

Done

i-chaochen

comment created time in 10 days

push event ROCmSoftwarePlatform/xla

A. Unique TensorFlower

commit sha abba1481c40cc7dbdaccbc4bbceb5b38da2b11f4

Initialize creation_pass_id in GPU Compiler

This sets the creation pass id and the logical creation pass id to -1 for all ops in the input HLO. -1 is the special value which identifies ops present in the input HLO. By setting -1 for input ops we will be able to differentiate ops originating from the input from ops generated by optimization passes.

PiperOrigin-RevId: 566929969

view details

Oleg Shyshkov

commit sha cd44c588740c677d115ac7fd1b30cc74acc6e4a5

[XLA:GPU] Priority fusion: don't allow partial fusions.

Only consider cases where a producer can be fused with all the consumers. Otherwise fusion is not beneficial.

PiperOrigin-RevId: 566932055

view details
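The rule in the commit above, fusing a producer only when it can be fused with every one of its consumers, can be sketched as a small predicate. This is a hypothetical Python illustration of the heuristic, not the actual XLA:GPU C++ implementation; the names `should_fuse_producer` and `can_fuse` are made up for the sketch:

```python
def should_fuse_producer(producer, consumers, can_fuse):
    """Fuse a producer only if it can be fused into *all* of its consumers.

    With a partial fusion, the unfused consumers still read the
    producer's output, so the producer instruction stays alive and its
    computation is duplicated inside the consumers it was fused into.
    """
    return bool(consumers) and all(can_fuse(producer, c) for c in consumers)

# Illustrative fusion check: every consumer except "reduce-window"
# accepts the producer.
fits = lambda producer, consumer: consumer != "reduce-window"

print(should_fuse_producer("exp", ["add", "mul"], fits))            # True
print(should_fuse_producer("exp", ["add", "reduce-window"], fits))  # False
```

If even one consumer rejects the producer, no fusion happens at all, which is why the partial case is "not beneficial": the producer's cost would be paid once standalone and again inside each fused consumer.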

A. Unique TensorFlower

commit sha cf1f0bf7121b1c7b444e5a9b63deba7453fa1588

Integrate LLVM at llvm/llvm-project@2baf4a06ef06

Updates LLVM usage to match [2baf4a06ef06](https://github.com/llvm/llvm-project/commit/2baf4a06ef06)

PiperOrigin-RevId: 566960191

view details

A. Unique TensorFlower

commit sha e6a84715a45d8c9d18acba1da45587c6e7e5f815

Update TFRT dependency to use revision http://github.com/tensorflow/runtime/commit/752d6d83d403986227dffe42beb5014843cf2ddb. PiperOrigin-RevId: 566971528

view details

Reed Wanderman-Milne

commit sha 4b472be5f3ac0d003fa7323a23cb2c90cd07ab29

Fix layout logic in IrArray::Index::Linearize.

The logic previously assumed the layout was major-to-minor. In practice, I believe this assumption is correct in all places where Linearize is called, because LayoutNormalization converts most instructions to be major-to-minor. But in general, IrArray supports arbitrary layouts and so we should obey the given layout in IrArray::Index::Linearize as well. I am working on a change which calls IrArray::Index::Linearize in another place, and it requires Linearize to correctly handle non-major-to-minor layouts. Also fix a comment incorrectly stating multidim was major-to-minor.

PiperOrigin-RevId: 566971940

view details
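The layout bug described in that commit message is easy to see in miniature. XLA layouts list dimension numbers from minor (fastest-varying) to major, and linearization has to walk that order rather than assume a major-to-minor (row-major) layout. Below is a small Python sketch of the idea; `linearize` is a made-up helper, not XLA's C++ `IrArray::Index::Linearize`:

```python
def linearize(multidim, dims, minor_to_major):
    """Linearize a multi-dimensional index under an arbitrary layout.

    minor_to_major lists dimension numbers from fastest-varying to
    slowest-varying, matching XLA's layout convention. Hard-coding a
    major-to-minor walk here would give wrong offsets for any other
    layout.
    """
    linear, stride = 0, 1
    for dim in minor_to_major:            # walk minor to major
        linear += multidim[dim] * stride
        stride *= dims[dim]
    return linear

# Index [1, 0] in a 2x3 shape under two different layouts:
print(linearize([1, 0], [2, 3], [1, 0]))  # row-major layout -> 3
print(linearize([1, 0], [2, 3], [0, 1]))  # column-major layout -> 1
```

The same logical index maps to different linear offsets under the two layouts, which is exactly why a Linearize that silently assumes one of them is wrong in general.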

A. Unique TensorFlower

commit sha 7c7ff5a0c07bce1cfa3004f2989a04418b6775da

Internal, fix build error. PiperOrigin-RevId: 566975236

view details

A. Unique TensorFlower

commit sha 6ea1dbae84db61b8b1684018b88726dc091ae837

Update Docker image tags and digests in Kokoro and Bazel

This is changing the Kokoro job config back to its original state (using the latest-python3.X docker image tags) and is updating the Bazel remote toolchain config to the latest Docker containers. (The ones created here https://github.com/tensorflow/tensorflow/actions/runs/6239795702/job/16938502214, which the latest tags also refer to.)

PiperOrigin-RevId: 566975309

view details

Rahul Joshi

commit sha ac171499bdc795a1e5c4e464798095933e7984f2

[NFC] Log NCCL calls for AllToAll

PiperOrigin-RevId: 566988443

view details

Peter Hawkins

commit sha c5138a43b771b6ecb779a66c6dd2acb0b25d0604

Rename tsl/cuda/BUILD to tsl/cuda/BUILD.bazel.

Unify _stub and _lib targets in tsl/cuda (under an unsuffixed name), and update users to point directly to the merged target. This also allows us to remove some .bzl macros.

PiperOrigin-RevId: 566996155

view details

Chao Chen

commit sha aca5aa0e853b201c94420f41c22f5b837dd97f76

ROCm adds gpu command buffer

view details

push time in 10 days

Pull request review comment openxla/xla

[ROCm] gpu command buffer for ROCm

 load(
     "//xla/stream_executor:build_defs.bzl",
     "stream_executor_friends",
 )
+load(
+    "//xla:xla.bzl",
+    "xla_cc_test",

Thanks for pointing that out. It's removed.

i-chaochen

comment created time in 10 days

Pull request review event

push event ROCmSoftwarePlatform/xla

Chao Chen

commit sha c7bb8d7976996b467ee0297ffb854a2d84e31fe7

ROCm adds gpu command buffer

view details

push time in 10 days

push event i-chaochen/DeepLearningSystem

chenzomi

commit sha 7b8abfb9f2ed948159b82a9ceef499d8bcc89b71

fix #63 issues.

view details

push time in 10 days

create branch ROCmSoftwarePlatform/tensorflow-upstream

branch : rocm_dnn_header_dep

created branch time in 10 days
