v0.13.0-alpha.3

Name                                 Modified     Size
README.md                            2023-07-19   975 Bytes
v0.13.0-alpha.3 source code.tar.gz   2023-07-19   604.1 kB
v0.13.0-alpha.3 source code.zip      2023-07-19   648.7 kB

WIP 0.13.0

Major changes

  • Add datapipe.metastore.TransformMetaTable. Each transform now gets its own meta table that tracks the status of each transformation
  • Generalize BatchTransform and DatatableBatchTransform through BaseBatchTransformStep
  • Add transform_keys to *BatchTransform
  • Move changed idx computation out of DataStore to BaseBatchTransformStep
  • Add a priority column to the transform meta table and sort work by priority
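
To illustrate the idea behind a per-transform meta table with prioritized work, here is a minimal in-memory sketch. It does not use or reproduce the datapipe API: ToyTransformMeta, TransformMetaRow, and all of their fields and methods are hypothetical names, and the ordering rule (higher priority first) is an assumption made for the example.

```python
from dataclasses import dataclass, field


@dataclass
class TransformMetaRow:
    # Hypothetical row layout; the real TransformMetaTable schema may differ.
    idx: str
    priority: int = 0
    is_success: bool = False


@dataclass
class ToyTransformMeta:
    """In-memory stand-in for one transform's meta table (illustration only)."""

    rows: dict = field(default_factory=dict)

    def fill(self, indices, priority=0):
        # Register indices awaiting processing, keeping existing rows intact.
        for idx in indices:
            self.rows.setdefault(idx, TransformMetaRow(idx, priority))

    def pending(self):
        # Unprocessed indices, highest priority first (assumed); ties by idx.
        todo = [r for r in self.rows.values() if not r.is_success]
        return [r.idx for r in sorted(todo, key=lambda r: (-r.priority, r.idx))]

    def mark_done(self, idx):
        self.rows[idx].is_success = True

    def reset(self):
        # Forget processing status so every index is picked up again.
        for row in self.rows.values():
            row.is_success = False


meta = ToyTransformMeta()
meta.fill(["a", "b"])
meta.fill(["c"], priority=10)
order_before = meta.pending()   # "c" is scheduled ahead of "a" and "b"
meta.mark_done("c")
order_after = meta.pending()    # only the unprocessed indices remain
```

Because each transform owns its status rows, resetting or refilling one transform's table leaves the others untouched, which is roughly what commands such as reset-metadata and fill-metadata operate on.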

New features

  • Add step reset-metadata CLI command
  • Add step fill-metadata CLI command that populates the transform meta table with all indices to process
  • Add helm chart for running regular loops in k8s as CronJob
  • Switch from vanilla tqdm to tqdm_loggable for better display in logs
  • Add step run-idx CLI command

  • Executors: datapipe.executor.SingleThreadExecutor, datapipe.executor.ray.RayExecutor
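
As a rough illustration of why a pluggable executor is useful, the sketch below runs the same batch function through a sequential executor and a pooled one. It is not the datapipe executor interface: ToySingleThreadExecutor, ToyThreadPoolExecutor, and the run method are hypothetical, and a standard-library thread pool stands in for a distributed backend such as Ray.

```python
from concurrent.futures import ThreadPoolExecutor


class ToySingleThreadExecutor:
    """Processes chunks one after another in the calling thread."""

    def run(self, func, chunks):
        return [func(chunk) for chunk in chunks]


class ToyThreadPoolExecutor:
    """Processes chunks concurrently; stand-in for a distributed executor."""

    def __init__(self, workers=4):
        self.workers = workers

    def run(self, func, chunks):
        # pool.map returns results in the same order as the input chunks.
        with ThreadPoolExecutor(max_workers=self.workers) as pool:
            return list(pool.map(func, chunks))


def process_chunk(chunk):
    # Placeholder batch transform: square every index value in the chunk.
    return [value * value for value in chunk]


chunks = [[1, 2], [3, 4], [5]]
sequential = ToySingleThreadExecutor().run(process_chunk, chunks)
parallel = ToyThreadPoolExecutor(workers=2).run(process_chunk, chunks)
```

Keeping the transform logic independent of the executor means the same step can be debugged sequentially and then scaled out without changing its code.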

Bugfixes

  • Fix QdrantStore.read_rows when no idx is specified
Source: README.md, updated 2023-07-19