Release Notes
Compare to 7.2 (including features from 8.0b1 and 8.0b2)
- Support for Latest Dependencies
  - Compatible with the latest `protobuf` python package, which improves serialization latency.
  - Support torch 2.4.0, numpy 2.0, scikit-learn 1.5.
- Support stateful Core ML models
  - Updates to the converter to produce Core ML models with the State Type (a new type introduced in iOS18/macOS15).
  - Adds a toy stateful attention example model to show how to use an in-place kv-cache.
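The in-place kv-cache pattern that the stateful attention example demonstrates can be pictured with a small numpy sketch. This is illustrative only, not the coremltools API: the `ToyKVCache` name and shapes are made up here, and the point is simply that preallocated state buffers are mutated across decoding steps rather than re-concatenated.

```python
import numpy as np

class ToyKVCache:
    """Toy in-place key/value cache: preallocated state buffers are written
    in place each decoding step, instead of re-concatenating past keys/values."""

    def __init__(self, max_seq_len, head_dim):
        self.k = np.zeros((max_seq_len, head_dim), dtype=np.float32)
        self.v = np.zeros((max_seq_len, head_dim), dtype=np.float32)
        self.length = 0  # number of valid cached positions

    def update(self, k_new, v_new):
        # Write the new token's key/value into the next free slot, in place.
        self.k[self.length] = k_new
        self.v[self.length] = v_new
        self.length += 1
        # Return views over the valid prefix, ready for attention.
        return self.k[: self.length], self.v[: self.length]

cache = ToyKVCache(max_seq_len=8, head_dim=4)
rng = np.random.default_rng(0)
for step in range(3):  # three decoding steps reuse the same buffers
    k_valid, v_valid = cache.update(rng.standard_normal(4),
                                    rng.standard_normal(4))

print(cache.length, k_valid.shape)  # → 3 (3, 4)
```

A Core ML State Type plays the role of these buffers: state persists between predictions, so each step only writes the newest key/value pair.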
- Increased conversion support coverage for models produced by `torch.export`
  - Op translation support is at 56% parity with our mature `torch.jit.trace` converter.
  - Representative deep learning models (mobilebert, deeplab, edsr, mobilenet, vit, inception, resnet, wav2letter, emformer) have been supported.
  - Representative foundation models (llama, stable diffusion) have been supported.
  - A model quantized by `ct.optimize.torch` can be exported by `torch.export` and then converted.
- New Compression Features
  - `coremltools.optimize`
    - Support compression with more granularities: blockwise quantization, grouped channelwise palettization.
    - 4-bit weight quantization and 3-bit palettization.
    - Support joint compression modes (8-bit look-up tables for palettization, pruning + quantization/palettization).
    - Vector palettization by setting `cluster_dim > 1` and palettization with per-channel scale by setting `enable_per_channel_scale=True`.
    - Experimental activation quantization (take a W16A16 Core ML model and produce a W8A8 model).
  - API updates for `coremltools.optimize.coreml` and `coremltools.optimize.torch`.
  - Support some models quantized by `torchao` (including ops produced by torchao such as `_weight_int4pack_mm`).
  - Support more ops in the `quantized_decomposed` namespace, such as `embedding_4bit`.
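The two new granularities can be pictured with a dependency-free numpy sketch: blockwise quantization computes one scale per small block of weights instead of one per tensor, and palettization replaces weights with indices into a small look-up table. This is illustrative only; the real implementations live in `coremltools.optimize`, and the quantile-based centroids below stand in for proper clustering.

```python
import numpy as np

def blockwise_quantize(w, block_size=4, n_bits=4):
    """Symmetric n-bit quantization with one scale per block of
    `block_size` consecutive weights (finer than per-tensor scaling)."""
    qmax = 2 ** (n_bits - 1) - 1                      # 7 for 4-bit
    blocks = w.reshape(-1, block_size)
    scales = np.abs(blocks).max(axis=1, keepdims=True) / qmax
    q = np.round(blocks / scales).astype(np.int8)
    return q, scales

def palettize(w, n_bits=2):
    """Map weights to a 2**n_bits-entry look-up table and store per-weight
    LUT indices (quantile centroids stand in for k-means here)."""
    lut = np.quantile(w, np.linspace(0, 1, 2 ** n_bits))
    indices = np.abs(w[:, None] - lut[None, :]).argmin(axis=1)
    return lut, indices.astype(np.uint8)

rng = np.random.default_rng(0)
w = rng.standard_normal(16).astype(np.float32)

q, scales = blockwise_quantize(w)
w_deq = (q * scales).reshape(-1)   # dequantized approximation of w
lut, idx = palettize(w)
w_pal = lut[idx]                   # weights reconstructed from the LUT
```

The "grouped channelwise" and "vector" variants change only what a block is: a group of channels sharing one scale, or a `cluster_dim`-sized vector of weights mapped to one LUT entry.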
- Support new ops and fix bugs in old ops
  - Compression-related ops: `constexpr_blockwise_shift_scale`, `constexpr_lut_to_dense`, `constexpr_sparse_to_dense`, etc.
  - Updates to the GRU op.
  - SDPA op `scaled_dot_product_attention`.
  - `clip` op.
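As a reference for what the SDPA op encodes, here is the standard attention formula in numpy (unmasked and single-head for brevity; this is the textbook computation, not coremltools internals):

```python
import numpy as np

def scaled_dot_product_attention(q, k, v):
    """softmax(q @ k^T / sqrt(d)) @ v — the formula the SDPA op computes."""
    d = q.shape[-1]
    scores = q @ k.swapaxes(-1, -2) / np.sqrt(d)
    scores -= scores.max(axis=-1, keepdims=True)      # numerical stability
    weights = np.exp(scores)
    weights /= weights.sum(axis=-1, keepdims=True)    # rows sum to 1
    return weights @ v

rng = np.random.default_rng(1)
q, k, v = (rng.standard_normal((2, 5, 8)) for _ in range(3))
out = scaled_dot_product_attention(q, k, v)
print(out.shape)  # → (2, 5, 8)
```

Translating this as a single fused op, rather than its constituent matmul/softmax ops, lets the Core ML runtime pick an optimized attention kernel.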
- Updated the model loading API
  - Support `optimizationHints`.
  - Support loading specific functions for prediction.
- New utilities in `coremltools.utils`
  - `coremltools.utils.MultiFunctionDescriptor` and `coremltools.utils.save_multifunction`, for creating an mlprogram with multiple functions that can share weights.
  - `coremltools.models.utils.bisect_model` can break a large Core ML model into two smaller models of similar size.
  - `coremltools.models.utils.materialize_dynamic_shape_mlmodel` can convert a flexible input shape model into a static input shape model.
- Various other bug fixes, enhancements, clean ups and optimizations
- Special thanks to our external contributors for this release: @sslcandoit @FL33TW00D @dpanshu @timsneath @kasper0406 @lamtrinhdev @valfrom @teelrabbit @igeni @Cyanosite