| Name | Modified | Size | Downloads / Week |
|---|---|---|---|
| RDKit_2010_12_1.win32.py26.zip | 2011-01-14 | 12.2 MB | |
| README | 2011-01-14 | 47.1 kB | |
| RDKit_2010_12_1.tgz | 2011-01-14 | 9.3 MB | |
| Totals: 3 Items | | 21.6 MB | 0 |
****** Release_2010.12.1 *******
(Changes relative to Release_2010.09.1)
!!!!!! IMPORTANT !!!!!!
- Due to changes made to the fingerprinting code, RDKit and layered
fingerprints generated with this release are not compatible with
those from previous releases. For users of the database cartridge:
you will need to re-generate RDKit fingerprint columns and any
indices on molecule tables.
Acknowledgements:
- Eddie Cao, Andrew Dalke, James Davidson, Kirk DeLisle, Peter Gedeck,
TJ O'Donnell, Gianluca Sforna, Nik Stiefl, Riccardo Vianello
Bug Fixes:
- The depiction code no longer crashes with single-atom templates
(issue 3122141)
- Aromatic bonds in the beginning of a SMILES branch are now
correctly parsed (issue 3127883)
- A crash when generating 2d constrained coordinates was fixed (issue
3135833)
- Stereochemistry is no longer removed from double bonds in large
rings. (issue 3139534)
- Atom mapping information is no longer included in reaction products
(issue 3140490)
- Smiles parse failure with repeated ring labels and dot disconnects
fixed (issue 3145697)
- a bug causing the molecule drawing code to not use the cairo canvas
when it's installed was fixed
- the SMILES generated for charged, aromatic Se or Te has been fixed
(issue 3152751)
- PropertyMols constructed from pickles and then written to SD files
will now include the properties in the SD file.
- SMILES can now be generated correctly for very large molecules
where more than 50 rings are open at once. (issue 3154028)
New Features:
- All molecular descriptor calculators are now pulled in by the
rdkit.Chem.Descriptors module. So you can do things like:
Descriptors.MolLogP(mol) or Descriptors.fr_amide(mol)
- Atom-map numbers in SMILES are now supported. They can be accessed
as the atomic "molAtomMapNumber" property. (issue 3140494)
- It's now possible to tell the RDKit to generate non-canonical
SMILES via an optional argument to MolToSmiles. This is faster than
generating canonical SMILES, but is primarily intended for
debugging/testing. (issue 3140495)
- The function GenerateDepictionMatching2DStructure() has been added
to the rdkit.Chem.AllChem module to make generating
template-aligned depictions easier.
- Generating FCFP-like fingerprints is now more straightforward via
the useFeatures optional argument to GetMorganFingerprint()
- Extensive changes were made to the layered fingerprinting code to
allow better coverage of queries.
- Functionality for stripping common salts from molecules has been
added in rdkit.Chem.SaltRemover. The salts themselves are defined
in $RDBASE/Data/Salts.txt
- Functionality for recognizing common functional groups has been
added in rdkit.Chem.FunctionalGroups. The functional groups
themselves are defined in
$RDBASE/Data/Functional_Group_Hierarchy.txt
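As a rough illustration of a few of the Python-facing features listed
above (the unified Descriptors module, atom-map numbers exposed as the
"molAtomMapNumber" property, and non-canonical SMILES output), here is a
minimal sketch. The helper name summarize() and the example SMILES are
made up for illustration. The keyword canonical=False is the MolToSmiles
argument in current RDKit releases and is assumed to correspond to the
optional argument mentioned above. The import is guarded so the snippet
still loads where the RDKit is not installed.

```python
try:
    from rdkit import Chem
    from rdkit.Chem import Descriptors
    HAVE_RDKIT = True
except ImportError:  # RDKit not installed: summarize() will return None
    HAVE_RDKIT = False


def summarize(smiles):
    """Hypothetical helper: collect a few values for one SMILES string.

    Returns None if the RDKit is unavailable or the SMILES cannot be parsed.
    """
    if not HAVE_RDKIT:
        return None
    mol = Chem.MolFromSmiles(smiles)
    if mol is None:
        return None
    # Atom-map numbers from the input are stored as an atom property.
    atom_maps = {}
    for atom in mol.GetAtoms():
        if atom.HasProp("molAtomMapNumber"):
            atom_maps[atom.GetIdx()] = int(atom.GetProp("molAtomMapNumber"))
    return {
        "logp": Descriptors.MolLogP(mol),          # Crippen logP
        "n_amide": Descriptors.fr_amide(mol),      # fragment count
        "atom_maps": atom_maps,
        # Assumption: canonical=False is the "optional argument" in question.
        "noncanonical": Chem.MolToSmiles(mol, canonical=False),
    }


result = summarize("CC(=O)N[CH3:1]")
if result is not None:
    print(result)
```

Because every calculator is reachable from rdkit.Chem.Descriptors, the
same pattern should extend to other descriptors without extra imports.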
New Database Cartridge Features:
- The cartridge now supports SMARTS queries.
- The functions is_valid_{smiles,smarts}() are now available
(issue 3097359).
- The operator @= is now supported for testing molecule equality.
(issue 3120707)
- The functions featmorgan_fp() and featmorganbv_fp() are now
available for generating FCFP-like fingerprints.
Deprecated modules (to be removed in next release):
- rdkit.Chem.AvailDescriptors : the same functionality is now available
in a more useable manner from rdkit.Chem.Descriptors (see above).
Removed modules:
Other:
- RDKit support has been added to the Knime data mining and reporting
tool. More information is available from the knime.org community
site: http://tech.knime.org/community/rdkit
Thanks to Thorsten, Bernd, Michael, and the rest of the crew at
knime.com for making this possible.
- RPMs to allow easy installation of the RDKit on Fedora/CentOS/RHEL
and similar systems are now available. Thanks to Gianluca Sforna
for doing this work.
- The database cartridge now statically links the RDKit libraries.
This should make installation easier.
- The RDKit fingerprinter now by default sets 2 bits per hashed
subgraph instead of 4. The old behavior can be regained by setting
nBitsPerHash to 4.
****** Release_2010.09.1 *******
(Changes relative to Release_Q22010_1)
!!!!!! IMPORTANT !!!!!!
- Due to changes made to the layered fingerprinting code,
fingerprints generated with this release are not compatible with
fingerprints from earlier releases.
- The default arguments to the Morgan fingerprinting code will yield
fingerprints that are not backwards compatible.
Acknowledgements:
- Andrew Dalke, James Davidson, Paul Emsley, Peter Gedeck,
Uwe Hoffmann, Christian Kramer, Markus Kossner, TJ O'Donnell,
Gianluca Sforna, Nik Stiefl, Riccardo Vianello
Bug Fixes:
- A typo in the parameters for the Crippen clogp calculator was
fixed. (issue 3057201)
- some problems in the layered fingerprinting code were fixed. (issue
3030388)
- a bug in the ring-finding code that could lead to incorrect results
or crashes in large molecules was fixed.
- the Murtagh clustering code should now execute correctly on recent
versions of the MacOS.
- some problems with the cairo canvas were fixed
- a problem with matching non-default isotope SSS queries for molecules
read in from CTABs was fixed (issue 3073163).
- a problem with calculating AMW for molecules with non-default isotopes
was fixed.
New Features:
- a PostgreSQL cartridge for similarity and substructure searching
has been added to the RDKit distribution.
- The Morgan fingerprinting code accepts additional arguments that
control whether or not bond order and chirality are taken into
account. By default chirality is ignored and the bond order is
used. Another change with the MorganFPs is that ring information is
now included by default.
- 2D coordinates can now be generated for chemical reactions.
- The functions IsMoleculeReactantOfReaction and
IsMoleculeProductOfReaction have been added to the C++
interface. From python these are methods of the ChemicalReaction
class:
rxn.IsMoleculeReactant and rxn.IsMoleculeProduct
- The default bond length for depiction can now be changed.
- FCFP-like fingerprints can now be generated with the Morgan
fingerprinting code by starting with feature invariants.
- The close() method has been added to MolWriters.
- Morgan, atom-pair, and topological-torsion fingerprints can now
also be calculated as bit vectors.
- RDKit and layered fingerprints can now be generated using only
linear paths.
- the function findAllPathsOfLengthMtoN() was added
Deprecated modules (to be removed in next release):
Removed modules:
- rdkit/qtGui
- rdkit/RDToDo
- Projects/SDView
Other:
- As of this release a new version numbering scheme is being used:
YYYY.MM.minor. For example, this release was made in September 2010,
so it's v2010.09.1.
- the RDBASE environment variable is no longer required. It will be
used if set, but the code should work without it
- The directory Contrib/M_Kossner contains two new contributions from
Markus Kossner.
- A change was made to the subgraph matching code that speeds up
substructure searches involving repeated recursive queries.
- the deprecated registerQuery argument has been removed from the
substructure matching functions.
- the empty header files AtomProps.h and BondProps.h have been
removed.
- in order to simplify the build process the test databases are now
in svn
- some python functions to calculate descriptors (e.g. pyMolWt,
pyMolLogP) that have C++ equivalents have been removed to
clean up the interface
- the PIL canvas should no longer generate warnings
- Thanks to the help of Gianluca Sforna and Riccardo Vianello, it is
now much easier to package and distribute the RDKit.
- the bjam-based build system has been removed.
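The YYYY.MM.minor numbering scheme described in this section can be
handled with a few lines of code. The sketch below is only an
illustration: the names ReleaseVersion and parse_release are
hypothetical, and the accepted prefixes ("Release_" and "v") are taken
from the way versions are written in these notes. Older quarter-style
tags such as Release_Q22010_1 are deliberately not handled and raise
ValueError.

```python
from typing import NamedTuple


class ReleaseVersion(NamedTuple):
    """Hypothetical container for a YYYY.MM.minor release number."""
    year: int
    month: int
    minor: int


def parse_release(tag: str) -> ReleaseVersion:
    """Parse 'Release_2010.09.1', 'v2010.09.1' or '2010.09.1'."""
    text = tag
    for prefix in ("Release_", "v"):
        if text.startswith(prefix):
            text = text[len(prefix):]
    # Unpacking fails with ValueError unless there are exactly three parts.
    year, month, minor = (int(part) for part in text.split("."))
    if not 1 <= month <= 12:
        raise ValueError(f"month out of range in {tag!r}")
    return ReleaseVersion(year, month, minor)


older = parse_release("Release_2010.09.1")
newer = parse_release("Release_2010.12.1")
# Tuples compare field by field, so releases sort chronologically.
print(older, newer, older < newer)
```

Comparing the parsed tuples rather than the raw strings keeps the
ordering correct even if a minor number ever reaches two digits.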
****** Release_Q22010_1 *******
(Changes relative to Release_Q12010_1)
!!!!!! IMPORTANT !!!!!!
- There are a couple of refactoring changes that affect people using
the RDKit from C++. Please look in the Other section below for a list.
- If you are building the RDKit yourself, changes made in this
release require that you use a reasonably up-to-date version of
flex to build it. Please look in the Other section below for more
information.
Acknowledgements:
- Andrew Dalke, James Davidson, Kirk DeLisle, Thomas Heller, Peter Gedeck,
Greg Magoon, Noel O'Boyle, Nik Stiefl
Bug Fixes:
- The depictor no longer generates NaNs for some molecules on
windows (issue 2995724)
- [X] query features work correctly with chiral atoms. (issue
3000399)
- mols will no longer be deleted by python when atoms/bonds returned
from mol.Get{Atom,Bond}WithIdx() are still active. (issue 3007178)
- a problem with force-field construction for five-coordinate atoms
was fixed. (issue 3009337)
- double bonds to terminal atoms are no longer marked as "any" bonds
when writing mol blocks. (issue 3009756)
- a problem with stereochemistry of double bonds linking rings was
fixed. (issue 3009836)
- a problem with R/S assignment was fixed. (issue 3009911)
- error and warning messages are now properly displayed when cmake
builds are used on windows.
- a canonicalization problem with double bonds incident onto aromatic
rings was fixed. (issue 3018558)
- a problem with embedding fused small ring systems was fixed.
(issue 3019283)
New Features:
- RXN files can now be written. (issue 3011399)
- reaction smarts can now be written.
- v3000 RXN files can now be read. (issue 3009807)
- better support for query information in mol blocks is present.
(issue 2942501)
- Depictions of reactions can now be generated.
- Morgan fingerprints can now be calculated as bit vectors (as
opposed to count vectors).
- the method GetFeatureDefs() has been added to
MolChemicalFeatureFactory
- repeated recursive SMARTS queries in a single SMARTS will now be
recognized and matched much faster.
- the SMILES and SMARTS parsers can now be run safely in
multi-threaded code.
Deprecated modules (to be removed in next release):
- rdkit/qtGui
- Projects/SDView
Removed modules:
- SVD code: External/svdlibc External/svdpackc rdkit/PySVD
- rdkit/Chem/CDXMLWriter.py
Other:
- Large-scale changes to the handling of stereochemistry were
made for this release. These should make the code more robust.
- If you are building the RDKit yourself, changes made in this
release require that you use a reasonably up-to-date version of
flex to build it. This is likely to be a problem on Redhat, and
redhat-derived systems. Specifically: if your version of flex is
something like 2.5.4 (as opposed to something like 2.5.33, 2.5.34,
etc.), you will need to get a newer version from
http://flex.sourceforge.net in order to build the RDKit.
- Changes only affecting C++ programmers:
- The code for calculating topological-torsion and atom-pair
fingerprints has been moved from $RDBASE/Code/GraphMol/Descriptors
to $RDBASE/Code/GraphMol/Fingerprints.
- The naming convention for methods of ExplicitBitVect and
SparseBitVect has been changed to make it more consistent with
the rest of the RDKit.
- the bjam-based build system should be considered
deprecated. This is the last release in which it will be actively
maintained.
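The flex requirement described in this section is easy to get wrong when
versions are compared as strings, because "2.5.4" sorts after "2.5.33"
alphabetically even though it is the older release. The sketch below
compares the numeric components instead. The function names are
hypothetical, and the minimum of 2.5.33 is an assumption: the notes only
say that 2.5.4 is too old and that versions like 2.5.33 or 2.5.34 work.

```python
import re


def version_tuple(text):
    """Turn text such as 'flex 2.5.33' or '2.5.4' into a tuple of ints."""
    return tuple(int(part) for part in re.findall(r"\d+", text))


# Assumed threshold, not an official requirement (see the note above).
ASSUMED_MINIMUM = version_tuple("2.5.33")


def flex_is_recent_enough(version_text):
    """Compare numerically, component by component."""
    return version_tuple(version_text) >= ASSUMED_MINIMUM


for candidate in ("2.5.4", "2.5.33", "2.5.34"):
    print(candidate, flex_is_recent_enough(candidate))
```

The input is assumed to be the text printed by "flex --version", or a
bare dotted version number.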
****** Release_Q12010_1 *******
(Changes relative to Release_Q42009_1)
Acknowledgements:
- Andrew Dalke, Jean-Marc Nuzillard, Noel O'Boyle, Gianluca Sforna,
Nik Stiefl, Anna Vulpetti
Bug Fixes
- Substantial improvements were made to the SLN parser
- A bad depiction case was fixed. (issue 2948402)
- Hs added to planar carbons are no longer in the same plane as the
other atoms. (issue 2951221)
- Elements early in the periodic table (e.g. Mg, Na, etc.) no longer
have their radical counts incorrectly assigned. (issue 2952255)
- Some improvements were made to the v3k mol file parser. (issue
2952272)
- Double bonds with unspecified stereochemistry are now correctly
flagged when output to mol files. (issue 2963522)
- A segmentation fault that occurred when kekulizing modified
molecules has been fixed. (issue 2983794)
New Features
- The MaxMin diversity picker can now be given a seed for the random
number generator to ensure reproducible results.
Other
- the vflib source, which is no longer used, was removed from the
External source tree. It's still available in svn at rev1323 or via
this tarball:
http://rdkit.svn.sourceforge.net/viewvc/rdkit/trunk/External/vflib-2.0.tar.gz?view=tar&pathrev=1323
- the directory Contrib has been added to the RDKit distribution to
house contributions that don't necessarily fit anywhere else. The
first contribution here is a collection of scripts required to
implement local-environment fingerprints contributed by Anna
Vulpetti.
- Some optimization work was done on the molecule initialization code:
reading in molecules is now somewhat faster.
- Some optimization work was done on the RDK and Layered fingerprinting code.
****** Release_Q42009_1 *******
(Changes relative to Release_Q32009_1)
!!!!!! IMPORTANT !!!!!!
- A bug fix in the SMARTS parser has changed the way atom-map
numbers in Reaction SMARTS are parsed.
Earlier versions of the RDKit required that atom maps be
specified at the beginning of a complex atom query:
[CH3:1,NH2]>>[*:1]O
The corrected version only accepts this form:
[CH3,NH2:1]>>[*:1]O
This change may break existing SMARTS patterns.
- A switch to using cmake as the build system instead of bjam has
made the RDKit much easier to build.
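Existing Reaction SMARTS written in the old style can often be migrated
mechanically by moving the atom-map number to the end of each bracket
atom. The following is a minimal sketch of that rewrite using only the
Python standard library. The function name move_atom_map_to_end is
hypothetical, and the approach is deliberately simple: it only touches
innermost bracket atoms that contain exactly one ":n" map, so patterns
with recursive SMARTS or unusual constructs should be checked by hand.

```python
import re

# Innermost bracket atoms only: no nested '[' or ']' inside.
_BRACKET_ATOM = re.compile(r"\[([^\[\]]+)\]")
_ATOM_MAP = re.compile(r":(\d+)")


def move_atom_map_to_end(smarts):
    """Rewrite e.g. '[CH3:1,NH2]' as '[CH3,NH2:1]'."""

    def fix(match):
        body = match.group(1)
        maps = _ATOM_MAP.findall(body)
        if len(maps) != 1:
            # No map, or an ambiguous case: leave the atom untouched.
            return match.group(0)
        stripped = _ATOM_MAP.sub("", body, count=1)
        return "[" + stripped + ":" + maps[0] + "]"

    return _BRACKET_ATOM.sub(fix, smarts)


print(move_atom_map_to_end("[CH3:1,NH2]>>[*:1]O"))
```

Patterns that already follow the corrected form pass through unchanged,
so the helper can be applied to a mixed collection of SMARTS.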
Acknowledgements
- Andrew Dalke, Kirk DeLisle, David Hall, Markus Kossner, Adrian
Schreyer, Nikolaus Stiefl, Jeremy Yang
Bug Fixes
- the SMARTS parser now correctly requires that atom-map numbers be
at the end of a complex atom query.
(issue 1804420)
- a bug in the way SMARTS matches are uniquified has been fixed
(issue 2884178)
New Features
- The new SMARTS atomic query feature "x" (number of ring bonds) is
now supported.
- The proof-of-concept for a SWIG-based wrapper around the RDKit has
been expanded a bit in functionality. Samples are now included for
Java, C#, and Python.
- Information about the current RDKit and boost versions is now
available from C++ (file RDGeneral/versions.h) and Python
(rdBase.rdkitVersion and rdBase.boostVersion)
- The KNN code now supports weighted nearest-neighbors calculations
with a radius cutoff.
Other
- The lapack dependency has been completely removed from the RDKit.
- The supported build system for the RDKit is now cmake
(http://www.cmake.org) instead of bjam. See the file INSTALL for
the new installation instructions. Files for bjam are still
included in the distribution but are deprecated and will be
removed in a future version.
****** Release_Q32009_1 *******
(Changes relative to Release_Q22009_1)
!!!!!! IMPORTANT !!!!!!
- Due to bug fixes in the boost random-number generator, RDK
fingerprints generated with boost 1.40 are not backwards
compatible with those from earlier versions.
Acknowledgements
- Uwe Hoffmann, Nik Stiefl, Greg Magoon, Ari Gold-Parker,
Akihiro Yokota, Kei Taneishi, Riccardo Vianello, Markus Kossner
Bug Fixes
- the canonOrient argument to the depiction code now works
(issue 2821647)
- typo in the depictor 2D embedding code fixed
(issue 2822883)
- single aromatic atoms in chains now (correctly) fail sanitization
(issue 2830244)
- problem with embedding and fused rings fixed
(issue 2835784)
- crash when reading some large molecules fixed
(issue 2840217)
- trailing newline in TemplateExpand.py fixed
(issue 2867325)
- fingerprint incompatibility on 64bit machines fixed
(issue 2875658)
- PropertyMol properties are now written to SD files
(issue 2880943)
New Features
- to the extent possible, reactions now transfer coordinates from
reactant molecules to product molecules (issue 2832951)
Other
- the function DaylightFingerprintMol() has been removed
- the outdated support for Interbase has been removed
- the Compute2DCoords() function in Python now canonicalizes the
orientation of the molecule by default.
- the distance-geometry code should now generate fewer bad amide
conformations. (issue 2819563)
- the quality of distance-geometry embeddings for substituted- and
fused-ring systems should be better.
****** Release_Q22009_1 *******
(Changes relative to Release_Q12009_2)
Acknowledgements
- Uwe Hoffmann, Marshall Levesque, Armin Widmer
Bug Fixes
- handling of crossed bonds in mol files fixed (issue 2804599)
- serialization bug fixed (issue 2788233)
- pi systems with 2 electrons now flagged as aromatic (issue 2787221)
- Chirality swap on AddHs (issue 2762917)
- core leak in UFFOptimizeMolecule fixed (issue 2757824)
New Features
- cairo support in the mol drawing code (from Uwe Hoffmann) (issue 2720611)
- Tversky and Tanimoto similarities now supported for SparseIntVects
- AllProbeBitsMatch supported for BitVect-BitVect comparisons
- ChemicalReactions support serialization (pickling) (issue 2799770)
- GetAtomPairFingerprint() supports minLength and maxLength arguments
- GetHashedTopologicalTorsionFingerprint() added
- preliminary support added for v3K mol files
- ForwardSDMolSupplier added
- CompressedSDMolSupplier added (not supported on windows)
- UFFHasAllMoleculeParams() added
- substructure searching code now uses an RDKit implementation of
the vf2 algorithm. It's much faster.
- Atom.GetPropNames() and Bond.GetPropNames() now available from
python
- BRICS code now supports FindBRICSBonds() and BreakBRICSBonds()
- atom labels Q, A, and * in CTABs are more correctly supported
(issue 2797708)
- rdkit.Chem.PropertyMol added (issue 2742959)
- support has been added for enabling and disabling logs
(issue 2738020)
Other
- A demo has been added for using the MPI with the RDKit
($RDBASE/Code/Demos/RDKit/MPI).
- Embedding code is now better at handling chiral structures and
should produce results for molecules with atoms that don't have
UFF parameters.
- the UFF code is more robust w.r.t. missing parameters
- GetHashedAtomPairFingerprint() returns SparseIntVect instead of
ExplicitBitVect
- the CTAB parser (used for mol files and SD files) is faster
- extensive changes to the layered fingerprinting code;
fingerprinting queries is now possible
- molecule discriminator code moved into $RDBASE/Code/GraphMol/Subgraphs
- the SDView4 prototype has been expanded
- $RDBASE/Regress has been added to contain regression and
benchmarking data and scripts.
- support for sqlalchemy has been added to $RDBASE/rdkit/Chem/MolDb
- $RDBASE/Projects/DbCLI/SDSearch.py has been removed; use the
CreateDb.py and SearchDb.py scripts in the same directory instead.
- the BRICS code has been refactored
****** Release_Q12009_2 *******
(Changes relative to Release_Q42008_1)
!!!!!! IMPORTANT !!!!!!
- The directory structure of the distribution has been changed in
order to make installation of the RDKit python modules more
straightforward. Specifically the directory $RDBASE/Python has been
renamed to $RDBASE/rdkit and the Python code now expects that
$RDBASE is in your PYTHONPATH. When importing RDKit Python modules,
one should now do: "from rdkit import Chem" instead of "import
Chem". Old code will continue to work if you also add $RDBASE/rdkit
to your PYTHONPATH, but it is strongly suggested that you update
your scripts to reflect the new organization.
- For C++ programmers: There is a non-backwards compatible change in
the way atoms and bonds are stored on molecules. See the *Other*
section for details.
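Scripts that have to run against both the old and the new directory
layout could use an import fallback along the following lines. This is
only a sketch of one possible migration aid, not something shipped with
the RDKit; setting Chem to None when neither import works is an
arbitrary choice made here so that the snippet stays importable.

```python
try:
    # New layout: $RDBASE is on PYTHONPATH and modules live under rdkit.
    from rdkit import Chem
except ImportError:
    try:
        # Old layout: the directory holding Chem is itself on PYTHONPATH.
        import Chem
    except ImportError:
        # Neither layout is importable in this environment.
        Chem = None

print("Chem available:", Chem is not None)
```

Once every script has been updated, the fallback branch can be dropped
and only the "from rdkit import Chem" form kept.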
Acknowledgements
- Kirk DeLisle, Noel O'Boyle, Andrew Dalke, Peter Gedeck, Armin Widmer
Bug Fixes
- Incorrect coordinates from mol2 files (issue 2727976)
- Incorrect handling of 0s as ring closure digits (issues 2525792,
and 2690982)
- Incorrect handling of atoms with explicit Hs in reactions (issue 2540021)
- SmilesMolSupplier.GetItemText() crashes (issue 2632960)
- Incorrect handling of dot separations in reaction SMARTS (issue 2690530)
- Bad charge lines in mol blocks for large molecules (issue 2692246)
- Order dependence in AssignAtomChiralTagsFromStructure (issue 2705543)
- Order dependence in the 2D pharmacophore code
- the LayeredFingerprints now handle non-aromatic single ring bonds
between aromatic atoms correctly.
New Features
- BRICS implementation
- Morgan/circular fingerprints implementation
- The 2D pharmacophore code now uses standard RDKit fdef files.
- Atom parity information in CTABs now written and read. If present
on reading, atom parity flags are stored in the atomic property
"molParity".
- An optional "fromAtoms" argument has been added to the atom pairs
and topological torsion fingerprints. If this is provided, only atom
pairs including the specified atoms, or torsions that either start
or end at the specified atoms, will be included in the fingerprint.
- Kekulization is now optional when generating CTABs. Since the MDL
spec suggests that aromatic bonds not be used, this is primarily
intended for debugging purposes.
- the removeStereochemistry() (RemoveStereoChemistry() from Python)
function has been added to remove all stereochemical information
from a molecule.
Other
- The Qt3-based GUI functionality in $RDBASE/rdkit/qtGui and
$RDBASE/Projects/SDView is deprecated. It should still work, but it
will be removed in a future release. Please do not build anything
new on this (very old and creaky) framework.
- The function DaylightFingerprintMol() is now deprecated, use
RDKFingerprintMol() instead.
- For C++ programmers: The ROMol methods getAtomPMap() and
getBondPMap() have been removed. The molecules themselves now support
an operator[]() method that can be used to convert graph iterators
(e.g. ROMol::edge_iterator, ROMol::vertex_iterator,
ROMol::adjacency_iterator) to the corresponding Atoms and Bonds.
New API for looping over an atom's bonds:
    // molPtr is a const ROMol *
    // atomPtr is a const Atom *
    ROMol::OEDGE_ITER beg,end;
    boost::tie(beg,end) = molPtr->getAtomBonds(atomPtr);
    while(beg!=end){
      const BOND_SPTR bond=(*molPtr)[*beg];
      // ... do something with the Bond ...
      ++beg;
    }
New API for looping over a molecule's atoms:
    // mol is an ROMol
    ROMol::VERTEX_ITER atBegin,atEnd;
    boost::tie(atBegin,atEnd) = mol.getVertices();
    while(atBegin!=atEnd){
      ATOM_SPTR at2=mol[*atBegin];
      // ... do something with the Atom ...
      ++atBegin;
    }
****** Release_Q42008_1 *******
(Changes relative to Release_Q32008_1)
!!!!!! IMPORTANT !!!!!!
- A fix in the handling of stereochemistry in rings means that the
SMILES generated with this release are different from those in
previous releases. Note that the canonicalization algorithm does
not work in cases of pure ring stereochemistry: the SMILES should
be correct, but it is not canonical. Rings containing chiral
centers should be fine.
Acknowledgements:
- Kirk DeLisle, Markus Kossner, Greg Magoon, Nik Stiefl
Bug Fixes
- core leaks in learning code (issue 2152622)
- H-bond acceptor definitions (issue 2183240)
- handling of aromatic dummies (issue 2196817)
- errors in variable quantization (issue 2202974)
- errors in information theory functions on 64 bit machines (issue 2202977)
- kekulization problems (issue 2202977)
- infinite loop in getShortestPaths() for disconnected structures (issue 2219400)
- error in depictor for double bonds with stereochemistry connected
to rings (issue 2303566)
- aromaticity flags not copied to null atoms in reaction products
(issue 2308128)
- aromaticity perception in large molecule hangs (issue 2313979)
- invariant error in canonicalization (issue 2316677)
- mol file parser handling of bogus bond orders (issue 2337369)
- UFF optimization not terminating when atoms are on top of each
other (issue 2378119)
- incorrect valence errors with 4 coordinate B- (issue 2381580)
- incorrect parsing of atom-list queries with high-numbered atoms
(issue 2413431)
- MolOps::mergeQueryHs() crashing with non-query molecules. (issue
2414779)
New Features
- SLN parser (request 2136703).
- Mol2 parser : Corina atom types (request 2136705).
- Building under mingw (request 2292153).
- Null bonds in reaction products are replaced with the corresponding
bond from the reactants (request 2308123).
Other
- a bunch of deprecation warnings from numpy have been cleaned up
(issue 2318431)
- updated documentation
- some optimization work on the fingerprinter
****** Release_Q32008_1 *******
(Changes relative to Release_May2008_1)
Acknowledgements:
- Noel O'Boyle, Igor Filippov, Evgueni Kolossov, Greg Magoon
Bug Fixes
- A memory leak in the ToBase64 and FromBase64 wrapper functions was
fixed.
- The UFF atom typer has been made more permissive: it now will pick
"close" atom types for things it does not recognize. (issue
2094445)
- The handling of molecules containing radicals has been greatly
improved (issues 2091839, 2091890, 2093420)
- Iterative (or secondary, or dependent) chirality is now supported,
see this page for more information:
http://code.google.com/p/rdkit/wiki/IterativeChirality
(issue 1931470)
- Isotope handling has been changed, this allows correct matching of
SMARTS with specified isotopes. (issue 1968930)
- Some problems with the MACCS key definitions have been
fixed. (issue 2027446)
- Molecules with multiple fragments can now be correctly
embedded. (issue 1989539)
- Adding multiple bonds between the same atoms in a molecule now
produces an error. (issue 1993296)
- The chemical reaction code now handles chiral atoms correctly
when applying reactions with no stereochem information
provided. (issue 2050085)
- A problem with single-atom cores in TemplateExpand.py has been
fixed. (issue 2091304)
- A problem causing bicyclobutane-containing molecules to not be
embeddable has been fixed. (issue 2091864)
- The default parameters for embedding are now molecule-size
dependent. This should help with the embedding of large and
crowded molecules. (issue 2091974)
- The codebase can now be built with boost 1.36. (issue 2071168)
- A problem with serialization of bond directions was fixed.
(issue 2113433)
New Features
- The RDKit can now be built under Darwin (Mac OS/X).
- Tversky similarity can now be calculated. (request 2015633)
- Many of the core datastructures now support equality comparison
(operator==). (request 1997439)
- Chirality information can now be assigned based on the 3D
coordinates of a molecule using
MolOps::assignChiralTypesFrom3D(). (request 1973062)
- MolOps::getMolFrags() can now return a list of split molecules
instead of just a list of atom ids. (request 1992648)
- ROMol::getPropNames() now supports the includePrivate and
includeComputed options. (request 2047386)
Other
- the pointers returned from Base64Encode/Decode are now allocated
using new instead of malloc or calloc. The memory should be
released with delete[].
- the generation of invariants for chirality testing is now quite a
bit faster; this results in faster parsing of molecules.
- The use of C include files instead of their C++ replacements has
been dramatically reduced.
- The new (as of May2008) hashing algorithm for fingerprints is now
the default in the python fingerprinting code
(Chem.Fingerprints.FingerprintMols).
- The functions MolOps::assignAtomChiralCodes() and
MolOps::assignBondStereoCodes() are deprecated. Use
MolOps::assignStereochemistry() instead.
- The RDKit no longer uses the old numeric python library. It now
uses numpy, which is actively supported.
- By default Lapack++ is no longer used. The replacement is the boost
numeric bindings: http://mathema.tician.de/software/boost-bindings.
****** Release_May2008_1 *******
(Changes relative to Release_Jan2008_1)
!!!!!! IMPORTANT !!!!!!
- A fix to the values of the parameters for the Crippen LogP
calculator means that the values calculated with this version are
not backwards compatible. Old values should be recalculated.
- topological fingerprints generated with this version *may* not be
compatible with those from earlier versions. Please read the note
below in the "Other" section.
- Please read the point about dummy atoms in the "New Features"
section. It explains a change that affects backwards compatibility
when dealing with dummy atoms.
Acknowledgements:
- Some of the bugs fixed in this release were found and reported by
Adrian Schreyer, Noel O'Boyle, and Markus Kossner.
Bug Fixes
- A core leak in MolAlign::getAlignmentTransform was fixed (issue
1899787)
- Mol suppliers now reset the EOF flag on their stream after they run
off the end (issue 1904170)
- A problem causing the string "Sc" to not parse correctly in
recursive SMARTS was fixed (issue 1912895)
- Combined recursive smarts queries are now output correctly.
(issue 1914154)
- A bug in the handling of chirality in reactions was fixed (issue
1920627)
- Looping directly over a supplier no longer causes a crash (issue
1928819)
- a core leak in the smiles parser was fixed (issue 1929199)
- Se and Te are now potential aromatic atoms (per the proposed
OpenSmiles standard). (issue 1932365)
- isotope information (and other atomic modifiers) is now correctly
propagated by chemical reactions (issue 1934052)
- triple bonds no longer contribute 2 electrons to the count for
aromaticity (issue 1940646)
- Two bugs connected with square brackets in SMILES were fixed
(issues 1942220 and 1942657)
- atoms with coordination numbers higher than 4 now have tetrahedral
stereochemistry removed (issue 1942656)
- Bond.SetStereo() is no longer exposed to Python (issue 1944575)
- A few typos in the parameter data for the Crippen logp calculator
were fixed. Values calculated with this version should be assumed
to not be backwards compatible with older versions (issue 1950302)
- Isotope queries are now added correctly (if perhaps not optimally)
to SMARTS.
- some drawing-related bugs have been cleared up.
- A bug in Chem.WedgeMolBonds (used in the drawing code) that was
causing incorrect stereochemistry in drawn structures was
fixed. (issue 1965035)
- A bug causing errors or crashes on Windows with [r<n>] queries was
fixed. (issue 1968930)
- A bug in the calculation of TPSA values in molecules that have Hs
in the graph was fixed. (issue 1969745)
New Features
- Support for supplying dummy atoms as "[Du]", "[X]", "[Xa]", etc. is
now considered deprecated. In this release a warning will be
generated for these forms and in the next release the old form will
generate errors. Note that the output of dummy atoms has also
changed: the default output format is now "*", this means that the
canonical SMILES for molecules containing dummies are no longer
compatible with the canonical SMILES from previous releases.
(feature request 186217)
- Atom and bond query information is now serializable; i.e. query
molecules can now be pickled and not lose the query
information. (feature request 1756596)
- Query features from mol files are now fully supported. (feature
request 1756962)
- Conformations now support a dimensionality flag. Dimensionality
information is now read from mol blocks and TDT files. (feature request
1906758)
- Bulk Dice similarity functions have been added for IntSparseIntVect
and LongSparseIntVect (feature request 1936450)
- Exceptions are no longer thrown during molecule parsing. Failure in
molecule parsing is indicated by returning None. Failure to *open* a
file when reading a molecule throws BadFileExceptions (feature
requests 1932875 and 1938303)
- The various similarity functions for BitVects and SparseIntVects
now take an optional returnDistance argument. If this is provided,
the functions return the corresponding distance instead of
similarity.
- Some additional query information from Mol files is now translated
when generating SMARTS. Additional queries now translated:
- number of ring bonds
- unsaturation queries
- atom lists are handled better as well
(feature request 1902466)
- A new algorithm for generating the bits for topological
fingerprints has been added. The new approach is a bit quicker and
more robust than the old, but is not backwards compatible.
Similarity trends are more or less conserved.
- The molecule drawing code in Chem.Draw.MolDrawing has been modified
so that it creates better drawings. A new option for drawing that
uses the aggdraw graphics library has been added.
- The RingInfo class supports two new methods: AtomRings() and
BondRings() that return tuples of tuples with indices of the atoms
or bonds that make up the molecule's rings.
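As a rough illustration of the bulk Dice functions and the
returnDistance argument described above, here is a plain-Python
sketch that does not use the RDKit. Sparse count vectors are modelled
as dicts, and the names dice_similarity, bulk_dice_similarity and
return_distance are made up for this example. The handling of two
empty vectors (similarity 0.0) is also an assumption of the sketch.

```python
def dice_similarity(v1, v2, return_distance=False):
    """Dice similarity between two sparse count vectors (dicts of index -> count).

    Illustrative only: this is a sketch, not the RDKit implementation.
    """
    total = sum(v1.values()) + sum(v2.values())
    if total == 0:
        sim = 0.0  # assumption: two empty vectors are treated as dissimilar
    else:
        # Overlap is the sum of the smaller count at each shared index.
        overlap = sum(min(count, v2[idx]) for idx, count in v1.items() if idx in v2)
        sim = 2.0 * overlap / total
    # With return_distance set, report 1 - similarity instead.
    return 1.0 - sim if return_distance else sim


def bulk_dice_similarity(query, others, return_distance=False):
    """Compare one vector against a list of vectors in a single call."""
    return [dice_similarity(query, other, return_distance) for other in others]


query = {1: 2, 5: 1, 9: 3}
library = [{1: 2, 5: 1, 9: 3}, {1: 1, 7: 4}, {2: 5}]

sims = bulk_dice_similarity(query, library)
dists = bulk_dice_similarity(query, library, return_distance=True)
print(sims)
print(dists)
```

In this sketch the distance is simply one minus the similarity, so
the two lists always mirror each other.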
Other
- Changes in the underlying boost random-number generator in version
1.35 of the boost library may have broken backwards compatibility
of 2D fingerprints generated using the old fingerprinter. It is
strongly suggested that you regenerate any stored fingerprints (and
switch to the new fingerprinter if possible). There is an explicit
test for this in $RDBASE/Code/GraphMol/Fingerprints/test1.cpp
- The unofficial and very obsolete version of John Torjo's v1
boost::logging library that was included with the RDKit
distribution is no longer used. The logging library has been
replaced with the much less powerful and flexible approach of just
sending things to stdout or stderr. If and when the logging library
is accepted into Boost, it will be integrated.
- The DbCLI tools (in $RDBASE/Projects/DbCLI) generate topological
fingerprints using both the old and new algorithms (unless the
--noOldFingerprints option is provided). The default search
uses the newer fingerprint.
- The directory $RDBASE/Data/SmartsLib contains a library of sample
SMARTS contributed by Richard Lewis.
****** Release_Jan2008_1 *******
(Changes relative to Release_Aug2007_1)
!!!!!! IMPORTANT !!!!!!
- Bug fixes in the canonicalization algorithm have made it so that
the canonical SMILES from this version are not compatible with
those from older versions of the RDKit.
- Please read the point about dummy atoms in the "New Features"
section. It explains a forthcoming change that will affect
backwards compatibility when dealing with dummy atoms.
- The build system has been completely changed. Makefiles and Visual
Studio project files have been removed. See the "Other" section for
more info.
Acknowledgements:
- Adrian Schreyer uncovered and reported a number of the bugs fixed
in this release.
Bug Fixes
- the Recap code no longer generates illegal fragments for
highly-branched atoms. (issue 1801871)
- the Recap code no longer breaks cyclic bonds to N
(issue 1804418)
- A bug in the kekulization of aromatic nitrogens has been fixed
(issue 1811276)
- Bugs in the Atom Type definitions for polar carbons and positive
nitrogens in BaseFeatures.fdef have been fixed. (issue 1836242)
- A crash in the sanitization of molecules that only have degree 4
atoms has been fixed; it now generates an exception. The underlying
problem with ring-finding in these systems is still present. (issue
1836576)
- Mol files for molecules that have more than 99 atoms or bonds are
no longer incorrectly generated. (issue 1836615)
- Problems with the sping PIL and PDF canvases have been cleared
up. The PIL canvas still generates a lot of warnings, but the
output is correct.
- The query "rN" is now properly interpreted to be "atom whose
smallest ring is of size N" in SMARTS queries. It was previously
interpreted as "atom is in a ring of size N". (issue 1811276)
This change required that the default feature definitions for
aromaticity and lumped hydrophobes be updated.
- The MolSuppliers (SDMolSupplier, TDTMolSupplier, SmilesMolSupplier)
no longer fail when reading the last element. (issue 1874882)
- A memory leak in the constructor of RWMols was fixed.
- A problem causing rapid memory growth with Recap analysis was fixed.
(issue 1880161)
- The Recap reactions are no longer applied to charged Ns or Os
(issue 1881803)
- Charges, H counts, and isotope information can now be set in
reactions. (issue 1882749)
- The stereo codes from double bonds (used for tracking cis/trans)
are now corrected when MolOps::removeHs is called. (issue 1894348)
- Various small code cleanups and edge case fixes were done as a
result of things discovered while getting the VC8 build working.
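The difference between the two readings of "rN" mentioned above can
be seen on a fused ring system. The sketch below is plain Python, not
RDKit code: the ring list is written by hand for an indane-like
bicycle, and the helper names in_ring_of_size and smallest_ring_size
are made up for illustration.

```python
def in_ring_of_size(atom_idx, n, atom_rings):
    """Old reading of rN: the atom is in at least one ring of size n."""
    return any(atom_idx in ring and len(ring) == n for ring in atom_rings)


def smallest_ring_size(atom_idx, atom_rings):
    """Size of the smallest ring containing the atom, or 0 if it is acyclic."""
    sizes = [len(ring) for ring in atom_rings if atom_idx in ring]
    return min(sizes) if sizes else 0


# Hand-written rings: a 6-ring (atoms 0-5) fused to a 5-ring through
# atoms 4 and 5.
atom_rings = ((0, 1, 2, 3, 4, 5), (4, 5, 6, 7, 8))
n_atoms = 9

# Old interpretation of r6: any atom that sits in some 6-ring.
old_r6 = [a for a in range(n_atoms) if in_ring_of_size(a, 6, atom_rings)]
# New interpretation of r6: atoms whose smallest ring has size 6.
new_r6 = [a for a in range(n_atoms) if smallest_ring_size(a, atom_rings) == 6]

print(old_r6)
print(new_r6)
```

The two fusion atoms are the ones that change: they are in a 6-ring,
but their smallest ring has 5 members, so only the old reading
matches them for r6.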
New Features
- The SparseIntVect class (used by the atom pairs and topological
torsions) is now implemented in C++.
- The package $RDKit/Python/Chem/MolDb has been added to help deal
with molecular databases. (this was intended for the August release
and overlooked)
- The module $RDKit/Python/Chem/FastSDMolSupplier has been added to
provide a fast (at the expense of memory consumption) class for
working with SD files. (this was intended for the August release
and overlooked)
- A new directory $RDKit/Projects has been created to hold things
that don't really fit in the existing directory structure.
- The new project $RDKit/Projects/DbCLI has been added. This contains
command-line scripts for populating molecular databases and
searching them using substructure or similarity.
- The code for calculating some descriptors has been moved into C++
in the new module Chem.rdMolDescriptors. The C++ implementation is
considerably faster than the Python one and should be 100%
backwards compatible.
- The MaxMinPicker (in Code/SimDivPickers) supports two new options:
1) the user can provide a set of initial picks and the algorithm
will pick new items that are diverse w.r.t. those
2) the user can provide a function to calculate the distance matrix
instead of calculating it in advance. This saves the N^2 step of
calculating the distance matrix.
- A new piece of code demo'ing the use of the RDKit to add chemical
functionality to SQLite is in Code/Demos/sqlite. This will
eventually move from Demos into Code/sqlite once some more
functionality has been added and more testing is done.
- The distance geometry embedding code now supports using random
initial coordinates for atoms instead of using the eigenvalues of
the distance matrix. The default behavior is still to use the
eigenvalues of the distance matrix.
- The function Recap.RecapDecompose now takes an optional argument
where the user can specify the minimum size (in number of atoms)
of a legal fragment. (feature request 180196)
- Dummy atoms can be expressed using asterisks, per the Daylight spec.
Dummy atoms are also now legal members of aromatic systems (e.g.
c1cccc*1 is a legal molecule). Support for supplying dummy atoms
as "[Du]", "[X]", "[Xa]", etc. is now considered deprecated. In
the next release a warning will be generated for these forms and
in the release after that the old form will generate errors. Note
that the output of dummy atoms will also change: in the next release
the default output format will be "*".
(feature request 186217)
- A proof of concept for doing a SWIG wrapper of RDKit functionality
has been added in: $RDBASE/Code/Demos/SWIG/java_example. This isn't
even remotely production-quality; it's intended to demonstrate that
the wrapping works and isn't overly difficult.
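The two MaxMinPicker options described above (initial picks and a
distance function in place of a precomputed matrix) can be sketched
in plain Python. This is a minimal illustration of the max-min idea,
not the code in Code/SimDivPickers. The function name maxmin_pick,
its arguments, and the choice of item 0 as the seed when no initial
picks are given are all assumptions of the sketch.

```python
def maxmin_pick(dist_fn, pool_size, n_picks, first_picks=()):
    """Greedy max-min selection using a distance callback.

    dist_fn(i, j) is called only for candidate/picked pairs, so the full
    pool_size x pool_size distance matrix is never built.
    n_picks is the total number of items returned, including first_picks.
    """
    picked = list(first_picks)
    if not picked:
        picked.append(0)  # assumption: seed with the first item in the pool
    picked_set = set(picked)

    # Distance from each remaining candidate to its nearest picked item.
    min_dist = {
        i: min(dist_fn(i, j) for j in picked)
        for i in range(pool_size)
        if i not in picked_set
    }

    while len(picked) < n_picks and min_dist:
        # Take the candidate that is farthest from everything picked so far.
        best = max(min_dist, key=min_dist.get)
        picked.append(best)
        del min_dist[best]
        # Only distances to the newly picked item need to be evaluated.
        for i in min_dist:
            d = dist_fn(i, best)
            if d < min_dist[i]:
                min_dist[i] = d
    return picked


# Toy pool: points on a line, with the absolute difference as distance.
points = [0.0, 1.0, 2.0, 10.0, 11.0, 20.0]
picks = maxmin_pick(lambda i, j: abs(points[i] - points[j]),
                    len(points), 3, first_picks=(0,))
print(picks)
```

Because distances are requested only between candidates and items
already picked, the number of evaluations grows with the pool size
times the number of picks rather than with the square of the pool
size.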
Other
- The full set of tests is now easier to set up and run on new
machines. (issue 1757265)
- A new build system, using Boost.Build, has been put into place on
both the windows and linux sides. The new system does much better
dependency checking and handles machine-specific stuff a lot
better. The new system has been tested using Visual Studio 2003,
Visual Studio Express 2005, Ubuntu 7.10, and RHEL5.
- The "Getting Started in Python" document has been expanded.
- There's now an epydoc config file for building the python
documentation ($RDBASE/Python/epydoc.config).
****** Release_Aug2007_1 *******
(Changes relative to Release_April2007_1)
Bug Fixes
- operators and SparseIntVects. (issue 1716736)
- the Mol file parser now calculates cis/trans labels for double
bonds where the two ends had the same substituents. (issue 1718794)
- iterator interface to DiscreteValueVects and UniformGrid3D. (issue
1719831)
- improper removal of stereochemistry from ring atoms (issue
1719053)
- stereochemistry specifications and ring bonds. (issue 1725068)
- handling of aromatic bonds in template molecules for chemical
reactions. (issue 1748846)
- handling of unrecognized atom types in the descriptor calculation
code. (issue 1749494)
- ChemicalReactionException now exposed to Python. (issue 1749513)
- some small problems in topological torsions and atom pairs
New Features
- The Atom Pairs and Topological Torsions code can now provide
"explanations" of the codes. See $RDBASE/Python/Chem/AtomPairs for
details.
- The PointND class has been exposed to Python
- The "Butina" clustering algorithm [JCICS 39:747-50 (1999)] is now
available in $RDBASE/Python/ML/Cluster/Butina.py
- A preliminary implementation of the subshape alignment algorithm is
available.
- The free version of MS Visual C++ is now supported.
- There is better support for queries in MDL mol files. (issue 1756962)
Specifically: ring and chain bond queries; the not modifier for
atom lists; R group labels.
- An EditableMol class is now exposed to Python to allow molecules to
be easily edited. (issue 1764162)
- The RingInfo class is now exposed to Python.
- The replaceSidechains and replaceCore functions have been added
in the ChemTransforms library and are exposed to Python as
Chem.ReplaceSidechains and Chem.ReplaceCore.
- pickle support added to classes: PointND
- atoms and bonds now support the HasQuery() and GetSmarts() methods
from Python.
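For readers unfamiliar with the "Butina" clustering mentioned above,
the following plain-Python sketch shows the general sphere-exclusion
idea: count neighbors within a cutoff, then let the items with the
most neighbors claim their unassigned neighbors as a cluster. It is
a simplified illustration and not the code in Butina.py. The function
name butina_cluster, the inclusive cutoff, and the tie-breaking by
index are assumptions of the sketch.

```python
def butina_cluster(n_points, dist_fn, cutoff):
    """Simplified sphere-exclusion clustering.

    Returns a list of tuples; the first index in each tuple is the
    cluster centroid.
    """
    # Build neighbor lists: pairs whose distance is within the cutoff.
    neighbors = [[] for _ in range(n_points)]
    for i in range(n_points):
        for j in range(i):
            if dist_fn(i, j) <= cutoff:
                neighbors[i].append(j)
                neighbors[j].append(i)

    # Visit points from most to fewest neighbors (ties broken by index).
    order = sorted(range(n_points), key=lambda i: (-len(neighbors[i]), i))

    assigned = [False] * n_points
    clusters = []
    for centroid in order:
        if assigned[centroid]:
            continue
        # The centroid claims every neighbor that is still unassigned.
        members = [centroid] + [m for m in neighbors[centroid] if not assigned[m]]
        for m in members:
            assigned[m] = True
        clusters.append(tuple(members))
    return clusters


# Toy data: points on a line, with the absolute difference as distance.
points = [0.0, 0.1, 0.2, 5.0, 5.1, 9.0]
clusters = butina_cluster(len(points),
                          lambda i, j: abs(points[i] - points[j]),
                          0.25)
print(clusters)
```

Points with no neighbors inside the cutoff end up as singleton
clusters.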
Other
- Similarity scores can now be calculated from Python in bulk
(i.e. calculating the similarity between one vector and a list of
others). This can be substantially faster than calling the
individual routines multiple times. The relevant functions are
BulkTanimotoSimilarity, BulkDiceSimilarity, etc.
- The calculation of AtomPairs and TopologicalTorsions fingerprints
is now a lot more efficient.
- Optimization of the Dice metric implementation for SparseIntVects
- The Visual Studio build files have been moved to the directories
$RDBASE/Code/Build.VC71 and $RDBASE/Code/Build.VC80. This allows
simultaneous support of both versions of the system and cleans up
the source trees a bit.
- Boost version 1.34 is now supported (testing has been done on 1.34
and 1.34.1).
- Updates to the "Getting Started" documentation.
****** Release_April2007_1 *******
(Changes relative to Release_Jan2007_1)
Bug Fixes
- handling of isotope information in SMILES has been fixed
- "implicit" hydrogens are now added to charged atoms explicitly when
writing SMILES. (issue 1670149)
- the 2D->3D code no longer generates non-planar conjugated 4-rings
(e.g. C1=CC=C1). (issue 1653802)
- removing explicit hydrogens no longer produces incorrect smiles
(issue 1694023)
- bit indices and signature lengths in the AtomPairs code are no
longer calculated incorrectly. *NOTE* this changes the bits that are
set, so if you have existing signatures, they will need to be
regenerated.
- Fixed a bug causing MolSuppliers to generate incorrect length
information when a combination of random access and iterator
interfaces is used. (issue 1702647)
- Fixed a bug leading to incorrect line numbers in error messages
from the SDMolSupplier. (issue 1695221)
New Features
- chemical reactions are now supported
- there is a new entry point into the 2D depictor code,
compute2DCoordsMimicDistMat(), that attempts to generate 2D
depictions that are similar to the structure described by the
distance matrix. There's also a shorthand approach for calling this
to mimic a 3D structure available as:
AllChem.GenerateDepictionMatching3DStructure()
- DiscreteValueVect and UniformGrid3D now support the binary
operators |, &, +, and -.
- a reader/writer for TPL files has been added.
- support has been added for MolCatalogs: hierarchical catalogs that
can store complete molecules.
- the protrude distance metric for shapes has been added
- pickle support added to classes: UniformGrid, DiscreteValueVect,
Point
- added the class DataStructs/SparseIntVect to improve performance
and clarity of the AtomPairs code
Other
- the non-GUI code now supports python2.5; the GUI code may work with
python2.5, but that has not been tested
- the Mol and SD file parsers have been sped up quite a bit.
- the "Crippen" descriptors are now calculated somewhat faster.
- in-code documentation updates
- new documentation for beginners in $RDBASE/Docs/Book
****** Release_Jan2007_1 *******
(Changes relative to Release_Oct2006_1)
Bug Fixes
- zero-atom molecules now trigger an exception
- dummy atoms are no longer labelled 'Xe'
- core leak in the mol file writer fixed
- mol files with multiple charge lines are now correctly parsed
- a workaround was installed to prevent crashes in the regression
tests on Windows when using the newest VC++ v7 series compiler.
(http://sourceforge.net/tracker/index.php?func=detail&aid=1607290&group_id=160139&atid=814650)
- chirality perception (which requires partial sanitization) is no
longer done by the MolFileParser when sanitization is switched
off.
- Two potential memory corruption problems were fixed (rev's 150 and
151).
New Features
- optional use of chirality in substructure searches
- MolWriters can now all take a stream as an argument
- Chiral terms can now be included in the DistanceGeometry
embedding.
Other
- $RDBASE/Code/Demos/RDKit/BinaryIO is a demonstration of using
boost IOStreams and the ROMol pickling mechanism to generate
highly compressed, random-access files of molecules.
- the Point code has been refactored