Yifeng Jiang | Vector Database and Storage | Is it true generative AI and RAG increase data storage by up to 10x? | May 30
Yifeng Jiang | Generative AI, RAG and Data Infrastructure | A practical introduction to Generative AI, RAG and their data infrastructure | Apr 15
Yifeng Jiang | Benchmarking Storage for AI Workloads | Choose the right storage for your AI infrastructure | Jan 19
Yifeng Jiang | Data and AI Skills, Better Together | View from a “Data Scientist” at a Storage Company | Dec 5, 2023
Yifeng Jiang | Make Petabytes Searchable — Elasticsearch Data Tiering Made Simple and Fast | Elastic searchable snapshots with fast S3 object storage | Mar 3, 2023
Yifeng Jiang | Accelerating Apache Spark with RAPIDS on GPU | Getting started, and benchmarking Spark RAPIDS on Kubernetes and fast S3 | Feb 13, 2023
Yifeng Jiang | 2022 in Big Data and Machine Learning | A review from a field data and machine learning architect | Dec 30, 2022
Yifeng Jiang | Smaller is Better — Big Data System in 2023 | Consolidating and accelerating big data with fast S3, Kubernetes and Spark RAPIDS | Nov 28, 2022
Yifeng Jiang | Build an Open Data Lakehouse with Spark, Delta and Trino on S3 | Combining the strength of data lake and warehouse in a way that is open, simple, and runs anywhere | Nov 7, 2022
Yifeng Jiang | Comparing Big Data Performance with Different Data Lake Storages | Big data benchmarks using TPC-DS and YCSB with HDFS, FlashBlade S3, and Amazon S3 | Jun 16, 2022