|
From: Andy S. <and...@gm...> - 2014-10-22 15:16:48
|
Not sure what you mean by C API; I was talking about methods on the various spatial C++ classes to get a data buffer.
If you want to add a Something::getData(std::vector<double>&);
in addition to the buffer method, i.e. Something::getData(int len, double* data);
that would be fine, but I'm just not sure how useful it would be. I don't think I've ever worked with a physics engine or realtime visualization library that used std::vector to hold vertex data. For example, take a look at DirectX or OpenGL: in both, all vertex data is handled in plain buffers. Even higher-level libraries like VTK provide their own classes to manage data buffers. In realtime physics or graphics systems you're going to be performing a lot of matrix operations, and std::vector is just not well suited to these.
Also, consider the intent of libsbml: it is a library to read and write data stored in SBML format. You want to use it to read a model and store it in your own internal data structures, which, chances are, are not going to be std::vector.
I'm not saying std::vector is bad, in fact I use it *a lot*; I'm just saying that it's not really well suited for things such as vertex or connectivity buffers.
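A minimal sketch of the buffer-style signature discussed here, for illustration only. `Something` follows the placeholder name used above, while `getDataLength()`, the stored values, and the two helper functions are hypothetical and not libSBML API. It shows one raw-pointer method filling both a std::vector and a plain caller-owned array.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a spatial class that owns an array of doubles.
class Something {
public:
    explicit Something(std::vector<double> values) : values_(std::move(values)) {}

    // Assumed helper: number of doubles available.
    int getDataLength() const { return static_cast<int>(values_.size()); }

    // Copies at most `len` values into the caller's buffer; returns the count copied.
    int getData(int len, double* data) const {
        if (len <= 0 || data == nullptr) return 0;
        const int n = std::min(len, getDataLength());
        std::copy(values_.begin(), values_.begin() + n, data);
        return n;
    }

private:
    std::vector<double> values_;
};

// The buffer method can fill a std::vector through its data pointer...
inline std::vector<double> readIntoVector(const Something& s) {
    std::vector<double> out(static_cast<std::size_t>(s.getDataLength()));
    s.getData(static_cast<int>(out.size()), out.data());
    return out;
}

// ...and equally a plain array (or any other container exposing a double*).
inline int readIntoArray(const Something& s, double* buffer, int capacity) {
    return s.getData(capacity, buffer);
}
```

The same pattern applies to any container that exposes contiguous storage, which is the argument for keeping the pointer-based method as the primary one.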
On Oct 22, 2014, at 8:38 AM, Weatherby,Gerard wrote:
> Better != only. The C api could remain. In fact the libsbml implementation could be as simple as:
>
> void getData(std::vector<double> & data) {
>     // resize (not reserve) so the elements exist before being written
>     data.resize(getArrayLen( ));
>     if (!data.empty( )) getArrayData(&data.front( ));
> }
>
> (I think, haven’t tested it)
>
>
> From: Andy Somogyi [mailto:and...@gm...]
> Sent: Wednesday, October 22, 2014 8:21 AM
> To: The SBML L3 Spatial Processes and Geometries package discussion list
> Subject: Re: [sbml-spatial] API (was Compression)
>
> What if you're not using std vector to store your data? Say your data is stored in an Eigen or Boost matrix? Or even your own data structure.
>
> Every numeric data structure (including std vector) has a way of getting pointer to the data, then you just pass this pointer to whatever func you want to read/write to that data.
>
> On Wednesday, October 22, 2014, Weatherby,Gerard <gwe...@uc...> wrote:
> From a C++ perspective, the better API would be
>
> void getData(std::vector<double> & data)
>
> where the implementation could reserve the necessary space (std::vector<>::reserve( ) ) and then fill the data.
>
> From: Frank T. Bergmann [mailto:fbe...@ca...]
> Sent: Wednesday, October 22, 2014 1:51 AM
> To: 'The SBML L3 Spatial Processes and Geometries package discussion list'
> Subject: Re: [sbml-spatial] Compression
>
> Hello Andy,
>
> The API you suggest:
>
> int len = obj->getArrayLen();
> double* myData = new double[len];
> obj->getArrayData(myData);
>
> is indeed what is currently implemented in libSBML.
>
> Frank
>
> From: Andy Somogyi [mailto:and...@gm...]
> Sent: Tuesday, October 21, 2014 8:00 PM
> To: The SBML L3 Spatial Processes and Geometries package discussion list
> Subject: Re: [sbml-spatial] Compression
>
> On the API side, I'm asking, please, please, please do not introduce a matrix or array class, and especially please don't return array data by value.
>
> What I think would work best is having simple methods to access the array data and have it copied into a user-provided buffer, something like
>
> int len = obj->getArrayLen();
> double* myData = new double[len];
> obj->getArrayData(myData);
>
> If it were return by value, something like
>
> vector<double> data = obj->getArrayData();
>
> this would result in a huge number of memory allocations and data copies that could easily be avoided if the data were just copied once into a user provided buffer.
>
>
> On Oct 21, 2014, at 1:46 PM, Devin Sullivan wrote:
>
>
> I will also voice a vote for option #2.
>
> On Fri, Oct 17, 2014 at 7:14 PM, Samuel Friedman <sam...@ca...> wrote:
> I agree with what Paul has said. If you're going to do compression, you want to do it once and not multiple times so I would vote for path #2. There are three reasons why you really don't want to go down route #3:
>
> 1) Floating point numbers generally don't compress well, because each value usually differs slightly from the others.
> 2) Compression algorithms tend to work better on larger chunks of data because they have more data to look at when trying to figure out what to compress.
> 3) If you go to compress your SBML file after you've inserted your compressed floating point numbers, you have done a double compression which is almost never worth your while.
>
>
> Sam
>
> On Fri, Oct 17, 2014 at 10:13 AM, Paul Macklin <pau...@us...> wrote:
> Parsing and postprocessing should be a lot easier and faster if the compression is within the XML (so the tags are still uncompressed and easy to parse), rather than enclosing the XML (so you have to decompress the whole thing prior to parsing and postprocessing / analysis). When the files are big and you have a lot of them to process, this becomes significant.
>
> Not that these are any of your 1-3 per se, but you do talk about sticking the whole thing into a zip file. We're shying away from that and looking towards HDF and/or XML + base64 because for 3D and multicell work, the files become pretty big and the wait for the zip/unzip process can be a pretty significant bottleneck to analyzing simulation outputs.
>
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Paul Macklin, Ph.D.
>
> Assistant Professor of Research Medicine
> Center for Applied Molecular Medicine
> Keck School of Medicine
> University of Southern California
> Los Angeles, CA
>
> Founder and Co-Lead of the MultiCellDS Project
> MultiCellDS: http://MultiCellDS.org / @MultiCellDS
>
> email: Pau...@us... / Pau...@Ma...
> web: http://MathCancer.org
> Twitter: @MathCancer
>
> mobile: +1 310-701-5785
> FAX: +1 323-442-2764
>
>
> On Fri, Oct 17, 2014 at 9:58 AM, Lucian Smith <luc...@gm...> wrote:
> OK, so one of the options can obviously remain 'write the numbers as a string, store that in the XML' for readability. For compression, we have:
>
> 1) binary --> string (ftoa) --> compressed string (this is the existing scheme)
> 2) binary --> base64
> 3) binary --> base64 --> compressed string
>
> Andy reports that base64 encoding of binary data is about 30% more efficient than string encoding of binary data (ftoa), and also has the advantage of being faster to process when decoding. Since ftoa results in a smaller character set (0-9,-,e,spaces), you'd recover some of that inefficiency if you compared 1) to 3), but probably not all of it. You'd also still have the slower decoding step.
>
> The disadvantage of 3) over 2) is that the resulting .zip file of the entire document would be slightly larger for 3) than for 2), so the question would become: what is the main purpose of encoding the data in the file this way? If it's 'smaller file size', you'd go with 2), but if it's 'less of the file I have to scroll through when reading it by hand', you'd want 3).
>
> Anyone have strong opinions either way? Is this worth an actual poll of the community?
>
> -Lucian
>
> On Wed, Oct 15, 2014 at 6:43 PM, Paul Macklin <pau...@us...> wrote:
> Interesting!
>
> Perhaps a big improvement to use ieee and base64 for all numerical fields and get rid of atof?
>
> On Oct 15, 2014 6:29 PM, "Andy Somogyi" <and...@gm...> wrote:
> A big part of the slowness comes from parsing a string to float, i.e. atof.
>
> Plus atof does not even work the same on different platforms, and different locales throw in another complication.
>
> All modern processors use the IEEE 754 double format, so it's actually a much more standard format than textually formatted numbers.
>
> On Wednesday, October 15, 2014, Paul Macklin <pau...@us...> wrote:
> Thanks, Andy.
>
> Out of curiosity, is that slowness from parsing complexity or from the disk read/write itself? Is it still the same bottleneck if reading/writing files on a solid state disk or ram disk?
>
>
>
>
> On Wed, Oct 15, 2014 at 6:08 PM, Andy Somogyi <and...@gm...> wrote:
> Just store the binary array as a base64 encoded blob.
>
> Not only will the file size be about 30% the size of converting to strings, but it is an order of magnitude faster in terms of parsing and reading the data.
>
> In profiling our simulations, the slowest part is currently reading the SBML, so anything that improves performance in this area would be very useful.
>
>
> On Wednesday, October 15, 2014, Paul Macklin <pau...@us...> wrote:
> It sounds like #1 converts the numbers to strings in a sprintf-like fashion, and then compresses this string (to another string).
>
> It sounds like #2 would directly compress the numbers (in their native binary format), then encode the compressed output as text (e.g., via base64)
>
> I was wondering what you thought of a (#1/#2)': encode the doubles/floats/whatever to text via base64 first, compress this, then store the resulting text in the data field.
>
> Thanks -- Paul
>
>
>
>
> On Wed, Oct 15, 2014 at 4:23 PM, Lucian Smith <luc...@gm...> wrote:
> OK, let me see if I can summarize the issues about compression, and ask people's opinions moving forward:
>
> As things stand right now, the spec itself is a little vague on how compression works. This obviously needs to be updated, but we should make sure we know what we want, first.
>
> The libsbml implementation of compression (and used by Frank and Jim) works by compressing a *string* of numbers into a format that can be written into an XML file safely (I still don't know which one, but let's assume that this, at least, doesn't need to be changed). This is why Frank is concerned about the delimiter or lack thereof: all spaces, delimiters, etc. are getting compressed along with everything else.
>
> The big advantage of this system is that it's implemented.
>
> The disadvantage of this system is that it's fairly inefficient, mostly because encoding a number as a string is inefficient to start with.
>
> So that's option #1: keep things as they are implemented now, with possible tweaks for delimiters, etc.
>
>
> For option #2, we could compress the arrays of numbers directly, and encode that compression in the same way in the XML. This would have the advantage of being more compressed, but has the disadvantage of not being implemented yet.
>
>
> For option #3, we could ditch compression entirely, and rely instead on our ability to compress the entire SBML document (libsbml has built-in features that let it read and write compressed documents). This would actually result in smaller files if the numbers were all written out than if those number strings were compressed first a la option #1. The disadvantage of this system is that it makes the uncompressed files really big, and therefore makes it harder to read/debug the parts that *aren't* huge arrays of numbers.
>
> As far as delimiters go, it seemed to me that the simplest option would be to allow a ';' delimiter wherever people wanted it, and to remove it for compression. The order of numbers and their meaning would be precisely defined in the spec, so that special delimiters (besides the space between the numbers themselves) were not strictly needed, but could be provided for readability.
>
> Also, keep in mind that if the size of the file itself is an issue, the entire file can be compressed, not just these strings of numbers. The point of compressing the numbers inside the XML file is (I believe) so that the *rest* of the file is easier to view manually.
>
> -Lucian
>
> ------------------------------------------------------------------------------
> Comprehensive Server Monitoring with Site24x7.
> Monitor 10 servers for $9/Month.
> Get alerted through email, SMS, voice calls or mobile push notifications.
> Take corrective actions from your mobile device.
> http://p.sf.net/sfu/Zoho
> _______________________________________________
> sbml-spatial mailing list
> sbm...@li...
> https://lists.sourceforge.net/lists/listinfo/sbml-spatial
>
>
> --
> Dr. Samuel H. Friedman
> University of Southern California Postdoctoral Scholar - Research Associate
> Center for Applied Molecular Medicine Keck School of Medicine
> Email: sam...@ca... Phone: 323-442-2531
> 2250 Alcazar St Rm 259 Los Angeles, CA 90033
>
|
|
From: Devin S. <de...@cm...> - 2014-10-22 14:52:54
|
Do we need these attributes at all? Doesn't it make more sense to define the property on the molecules, since all the shapes define a surface/volume? This way, if you have a cell and cytoplasm, you don't define the cell membrane twice (once for the cell membrane, once to describe the cytoplasmic volume). This is already how it's done in SBML. We then just map the particles to the surface or to the volume based on their attributes.
It has worked for us so far in doing simulation translations, but if you can think of a case where it won't, I'm all ears.
-Devin
On Wed, Oct 22, 2014 at 10:48 AM, Lucian Smith <luc...@gm...> wrote:
> Wait--how is it extended to deal with 2D objects explicitly? There's no current way to define the surface of a sphere, is there?
>
> -Lucian
>
> On Tue, Oct 21, 2014 at 11:30 PM, Frank T. Bergmann <fbe...@ca...> wrote:
>> I would expect my CSG objects to be always solid (unless I performed an intersection, explicitly hollowing the object out). We already extended it to deal with 2D elements explicitly, so people already have the option of adding surfaces explicitly. With the MixedGeometry, they could also add a parametric surface patch into the scene as well, if they wanted. So I do not see a reason to complicate the CSG further.
>>
>> Frank
>>
>> *From:* Lucian Smith [mailto:luc...@gm...]
>> *Sent:* Thursday, October 16, 2014 1:47 AM
>> *To:* The SBML L3 Spatial Processes and Geometries package discussion list
>> *Subject:* [sbml-spatial] Solids and surfaces
>>
>> Another issue that hasn't been resolved yet: do people want to be able to create CSGPrimitive surfaces as well as solids? Presumably, they would be used to create 2D compartments within 3D space (such as cell membranes), and/or 1D compartments within 2D space.
>>
>> If so, I would propose a new attribute on CSGPrimitive that indicated whether the shape was to be a "solid" or a "surface". For the 2D primitives, will 'solid' and 'surface' suffice, or would we need other terms, like 'filled' and 'border'?
>>
>> -Lucian
|
|
From: Frank T. B. <fbe...@ca...> - 2014-10-22 14:51:59
|
We extended the list of CSG objects to include 2D elements (circles, squares and triangles), something that is not commonly done.
Frank
From: Lucian Smith [mailto:luc...@gm...]
Sent: Wednesday, October 22, 2014 4:48 PM
To: The SBML L3 Spatial Processes and Geometries package discussion list
Subject: Re: [sbml-spatial] Solids and surfaces
Wait--how is it extended to deal with 2D objects explicitly? There's no current way to define the surface of a sphere, is there?
-Lucian
On Tue, Oct 21, 2014 at 11:30 PM, Frank T. Bergmann <fbe...@ca...> wrote:
I would expect my CSG objects to be always solid (unless I performed an intersection, explicitly hollowing the object out). We already extended it to deal with 2D elements explicitly, so people already have the option of adding surfaces explicitly. With the MixedGeometry, they could also add a parametric surface patch into the scene as well, if they wanted. So I do not see a reason to complicate the CSG further.
Frank
From: Lucian Smith [mailto:luc...@gm...]
Sent: Thursday, October 16, 2014 1:47 AM
To: The SBML L3 Spatial Processes and Geometries package discussion list
Subject: [sbml-spatial] Solids and surfaces
Another issue that hasn't been resolved yet: do people want to be able to create CSGPrimitive surfaces as well as solids? Presumably, they would be used to create 2D compartments within 3D space (such as cell membranes), and/or 1D compartments within 2D space.
If so, I would propose a new attribute on CSGPrimitive that indicated whether the shape was to be a "solid" or a "surface". For the 2D primitives, will 'solid' and 'surface' suffice, or would we need other terms, like 'filled' and 'border'?
-Lucian
|
|
From: Lucian S. <luc...@gm...> - 2014-10-22 14:48:25
|
Wait--how is it extended to deal with 2D objects explicitly? There's no current way to define the surface of a sphere, is there?
-Lucian
On Tue, Oct 21, 2014 at 11:30 PM, Frank T. Bergmann <fbe...@ca...> wrote:
> I would expect my CSG objects to be always solid (unless I performed an intersection, explicitly hollowing the object out). We already extended it to deal with 2D elements explicitly, so people already have the option of adding surfaces explicitly. With the MixedGeometry, they could also add a parametric surface patch into the scene as well, if they wanted. So I do not see a reason to complicate the CSG further.
>
> Frank
>
> *From:* Lucian Smith [mailto:luc...@gm...]
> *Sent:* Thursday, October 16, 2014 1:47 AM
> *To:* The SBML L3 Spatial Processes and Geometries package discussion list
> *Subject:* [sbml-spatial] Solids and surfaces
>
> Another issue that hasn't been resolved yet: do people want to be able to create CSGPrimitive surfaces as well as solids? Presumably, they would be used to create 2D compartments within 3D space (such as cell membranes), and/or 1D compartments within 2D space.
>
> If so, I would propose a new attribute on CSGPrimitive that indicated whether the shape was to be a "solid" or a "surface". For the 2D primitives, will 'solid' and 'surface' suffice, or would we need other terms, like 'filled' and 'border'?
>
> -Lucian
|
|
From: Weatherby,Gerard <gwe...@uc...> - 2014-10-22 12:38:17
|
Better != only. The C api could remain. In fact the libsbml implementation could be as simple as:
void getData(std::vector<double> & data) {
    // resize (not reserve) so the elements exist before being written
    data.resize(getArrayLen( ));
    if (!data.empty( )) getArrayData(&data.front( ));
}
(I think, haven’t tested it)
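A self-contained sketch of such a wrapper, for illustration only. `ArrayHolder` is a hypothetical class; `getArrayLen` and `getArrayData` mirror the calls named in this thread and are not a verified libSBML signature. One detail worth keeping in mind: `std::vector::reserve` changes only the capacity, so the vector has to be resized before the buffer method writes through its pointer.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical class exposing the buffer-style calls named in the thread.
class ArrayHolder {
public:
    ArrayHolder(const double* values, int len) : values_(values, values + len) {}

    int getArrayLen() const { return static_cast<int>(values_.size()); }

    // Copies all stored values into a caller-provided buffer that must hold
    // at least getArrayLen() doubles.
    void getArrayData(double* out) const {
        for (std::size_t i = 0; i < values_.size(); ++i) out[i] = values_[i];
    }

    // Convenience overload layered on top of the buffer method.
    void getData(std::vector<double>& data) const {
        // resize() creates the elements; reserve() alone leaves size() == 0.
        data.resize(static_cast<std::size_t>(getArrayLen()));
        if (!data.empty()) getArrayData(&data.front());
    }

private:
    std::vector<double> values_;
};
```

Layering the vector overload on the pointer-based method keeps both call styles available without duplicating the copy logic.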
From: Andy Somogyi [mailto:and...@gm...]
Sent: Wednesday, October 22, 2014 8:21 AM
To: The SBML L3 Spatial Processes and Geometries package discussion list
Subject: Re: [sbml-spatial] API (was Compression)
What if you're not using std vector to store your data? Say your data is stored in an Eigen or Boost matrix? Or even your own data structure.
Every numeric data structure (including std vector) has a way of getting pointer to the data, then you just pass this pointer to whatever func you want to read/write to that data.
On Wednesday, October 22, 2014, Weatherby,Gerard <gwe...@uc...> wrote:
From a C++ perspective, the better API would be
void getData(std::vector<double> & data)
where the implementation could reserve the necessary space (std::vector<>::reserve( ) ) and then fill the data.
From: Frank T. Bergmann [mailto:fbe...@ca...]
Sent: Wednesday, October 22, 2014 1:51 AM
To: 'The SBML L3 Spatial Processes and Geometries package discussion list'
Subject: Re: [sbml-spatial] Compression
Hello Andy,
The API you suggest:
int len = obj->getArrayLen();
double* myData = new double[len];
obj->getArrayData(myData);
is indeed what is currently implemented in libSBML.
Frank
From: Andy Somogyi [mailto:and...@gm...]
Sent: Tuesday, October 21, 2014 8:00 PM
To: The SBML L3 Spatial Processes and Geometries package discussion list
Subject: Re: [sbml-spatial] Compression
On the API side, I'm asking, please, please, please do not introduce a matrix or array class, and especially please don't return array data by value.
What I think would work best is having simple methods to access the array data and have it copied into a user-provided buffer, something like
int len = obj->getArrayLen();
double* myData = new double[len];
obj->getArrayData(myData);
If it were return by value, something like
vector<double> data = obj->getArrayData();
this would result in a huge number of memory allocations and data copies that could easily be avoided if the data were just copied once into a user provided buffer.
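A rough sketch of the reuse pattern this argues for, under stated assumptions: `Frame` and `sumFrames` are hypothetical and not libSBML code. One caller-owned buffer is filled repeatedly, so it only has to grow when a larger array arrives. (With C++11 move semantics a by-value return is cheaper than it once was, but each call still has to allocate a fresh vector.)

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical source of array data; each "step" yields `len` values.
struct Frame {
    int len;
    int step;

    int getArrayLen() const { return len; }

    // Writes len values into the caller-provided buffer.
    void getArrayData(double* out) const {
        for (int i = 0; i < len; ++i) out[i] = step * 100.0 + i;
    }
};

// Sums the data of several frames while reusing a single buffer.
// *allocations reports how often the buffer had to grow.
inline double sumFrames(const std::vector<Frame>& frames, int* allocations) {
    std::vector<double> buffer;
    double total = 0.0;
    *allocations = 0;
    for (const Frame& f : frames) {
        const std::size_t n = static_cast<std::size_t>(f.getArrayLen());
        if (n > buffer.capacity()) ++*allocations;  // growth implies a reallocation
        buffer.resize(n);
        if (n > 0) f.getArrayData(buffer.data());
        for (std::size_t i = 0; i < n; ++i) total += buffer[i];
    }
    return total;
}
```

For three frames of equal length the buffer grows once, on the first frame, and is reused for the other two.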
On Oct 21, 2014, at 1:46 PM, Devin Sullivan wrote:
I will also voice a vote for option #2.
On Fri, Oct 17, 2014 at 7:14 PM, Samuel Friedman <sam...@ca...> wrote:
I agree with what Paul has said. If you're going to do compression, you want to do it once and not multiple times so I would vote for path #2. There are three reasons why you really don't want to go down route #3:
1) Floating point numbers generally don't compress well, because each value usually differs slightly from the others.
2) Compression algorithms tend to work better on larger chunks of data because they have more data to look at when trying to figure out what to compress.
3) If you go to compress your SBML file after you've inserted your compressed floating point numbers, you have done a double compression which is almost never worth your while.
Sam
On Fri, Oct 17, 2014 at 10:13 AM, Paul Macklin <pau...@us...> wrote:
Parsing and postprocessing should be a lot easier and faster if the compression is within the XML (so the tags are still uncompressed and easy to parse), rather than enclosing the XML (so you have to decompress the whole thing prior to parsing and postprocessing / analysis). When the files are big and you have a lot of them to process, this becomes significant.
Not that these are any of your 1-3 per se, but you do talk about sticking the whole thing into a zip file. We're shying away from that and looking towards HDF and/or XML + base64 because for 3D and multicell work, the files become pretty big and the wait for the zip/unzip process can be a pretty significant bottleneck to analyzing simulation outputs.
On Fri, Oct 17, 2014 at 9:58 AM, Lucian Smith <luc...@gm...> wrote:
OK, so one of the options can obviously remain 'write the numbers as a string, store that in the XML' for readability. For compression, we have:
1) binary --> string (ftoa) --> compressed string (this is the existing scheme)
2) binary --> base64
3) binary --> base64 --> compressed string
Andy reports that base64 encoding of binary data is about 30% more efficient than string encoding of binary data (ftoa), and also has the advantage of being faster to process when decoding. Since ftoa results in a smaller character set (0-9,-,e,spaces), you'd recover some of that inefficiency if you compared 1) to 3), but probably not all of it. You'd also still have the slower decoding step.
The disadvantage of 3) over 2) is that the resulting .zip file of the entire document would be slightly larger for 3) than for 2), so the question would become: what is the main purpose of encoding the data in the file this way? If it's 'smaller file size', you'd go with 2), but if it's 'less of the file I have to scroll through when reading it by hand', you'd want 3).
Anyone have strong opinions either way? Is this worth an actual poll of the community?
-Lucian
On Wed, Oct 15, 2014 at 6:43 PM, Paul Macklin <pau...@us...> wrote:
Interesting!
Perhaps it would be a big improvement to use IEEE 754 and base64 for all numerical fields and get rid of atof?
On Oct 15, 2014 6:29 PM, "Andy Somogyi" <and...@gm...> wrote:
A big part of the slowness comes from parsing a string to float, i.e. atof.
Plus atof does not even work the same on different platforms, and different locales throw in another complication.
All modern processors use the IEEE 754 double format, so it's actually a much more standard format than textually formatted numbers.
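The contrast between the two decoding paths can be sketched as follows. This is an illustration under stated assumptions, not the libSBML code: `parseText` pins the stream to the classic "C" locale so the decimal point is always '.', while `fromBytes` simply copies already-decoded bytes into doubles and assumes the writer used the same byte order. Both function names are made up for this example.

```cpp
#include <cstddef>
#include <cstring>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Parse whitespace-separated numbers using the "C" locale regardless of the
// global locale, so "1.5" is always read with '.' as the decimal point.
std::vector<double> parseText(const std::string& text) {
    std::istringstream in(text);
    in.imbue(std::locale::classic());
    std::vector<double> values;
    double v;
    while (in >> v) values.push_back(v);
    return values;
}

// Reinterpret a raw byte buffer as doubles: no parsing, just one copy.
// Assumption: the bytes were written with the host's byte order.
std::vector<double> fromBytes(const unsigned char* bytes, std::size_t len) {
    std::vector<double> values(len / sizeof(double));
    if (!values.empty())
        std::memcpy(values.data(), bytes, values.size() * sizeof(double));
    return values;
}
```

The text path does per-character work for every number, whereas the binary path is a single block copy, which is the kind of difference the profiling remark above points at.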
On Wednesday, October 15, 2014, Paul Macklin <pau...@us...> wrote:
Thanks, Andy.
Out of curiosity, is that slowness from parsing complexity or from the disk read/write itself? Is it still the same bottleneck if reading/writing files on a solid state disk or ram disk?
On Wed, Oct 15, 2014 at 6:08 PM, Andy Somogyi <and...@gm...> wrote:
Just store the binary array as a base64 encoded blob.
Not only will the file size be about 30% the size of converting to strings, but it is an order of magnitude faster in terms of parsing and reading the data.
In profiling our simulations, currently the slowest part is reading the sbml, so anything that would improve performance in this area would be very useful.
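For anyone who wants to check the size claim on their own data, a rough sketch of the arithmetic follows. Base64 always needs 4 characters per 3 bytes, so about 10.7 characters per 8-byte double, while the text size depends on the values: writing 17 significant digits (enough to round-trip a double) can take roughly two dozen characters per number, but short values such as 1 take far fewer. The helper names are hypothetical, and the actual ratio depends entirely on the data and the print format.

```cpp
#include <cstddef>
#include <cstdio>
#include <vector>

// Characters needed to base64-encode nBytes raw bytes (with '=' padding).
std::size_t base64Length(std::size_t nBytes) {
    return 4 * ((nBytes + 2) / 3);
}

// Characters needed to write the values as space-separated text using
// 17 significant digits per number.
std::size_t textLength(const std::vector<double>& values) {
    std::size_t total = 0;
    char buf[64];
    for (std::size_t i = 0; i < values.size(); ++i) {
        int n = std::snprintf(buf, sizeof buf, "%.17g", values[i]);
        if (n > 0) total += static_cast<std::size_t>(n);
        if (i + 1 < values.size()) total += 1;  // one separator between numbers
    }
    return total;
}
```

Comparing `textLength(values)` with `base64Length(values.size() * sizeof(double))` on a representative array gives a quick estimate before committing to either encoding.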
On Wednesday, October 15, 2014, Paul Macklin <pau...@us...> wrote:
It sounds like #1 converts the numbers to strings in a sprintf-like fashion, and then compresses this string (to another string).
It sounds like #2 would directly compress the numbers (in their native binary format), then encode the compressed output as text (e.g., via base64)
I was wondering what you thought of a (#1/#2)': encode the doubles/floats/whatever to text via base64 first, compress this, then store the resulting text in the data field.
Thanks -- Paul
On Wed, Oct 15, 2014 at 4:23 PM, Lucian Smith <luc...@gm...> wrote:
OK, let me see if I can summarize the issues about compression, and ask people's opinions moving forward:
As things stand right now, the spec itself is a little vague on how compression works. This obviously needs to be updated, but we should make sure we know what we want, first.
The libsbml implementation of compression (the one used by Frank and Jim) works by compressing a *string* of numbers into a format that can be written into an XML file safely (I still don't know which one, but let's assume that this, at least, doesn't need to be changed). This is why Frank is concerned about the delimiter or lack thereof: all spaces, delimiters, etc. are getting compressed along with everything else.
The big advantage of this system is that it's implemented.
The disadvantage of this system is that it's fairly inefficient, mostly because encoding a number as a string is inefficient to start with.
So that's option #1: keep things as they are implemented now, with possible tweaks for delimiters, etc.
For option #2, we could compress the arrays of numbers directly, and encode that compression in the same way in the XML. This would have the advantage of being more compressed, but has the disadvantage of not being implemented yet.
For option #3, we could ditch compression entirely, and rely instead on our ability to compress the entire SBML document (libsbml has built-in features that let it read and write compressed documents). This would actually result in smaller files if the numbers were all written out than if those number strings were compressed first a la option #1. The disadvantage of this system is that it makes the files really big, and therefore harder to read/debug the parts that *aren't* huge arrays of numbers.
As far as delimiters go, it seemed to me that the simplest option would be to allow a ';' delimiter wherever people wanted it, and to remove it for compression. The order of numbers and their meaning would be precisely defined in the spec, so that special delimiters (besides the space between the numbers themselves) were not strictly needed, but could be provided for readability.
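The delimiter idea above can be sketched as a small normalization step run before compression. This is only an illustration: `stripDelimiters` is a hypothetical name, and it assumes ';' carries no meaning beyond readability, so it is treated like whitespace and the numbers end up separated by single spaces.

```cpp
#include <sstream>
#include <string>

// Hypothetical helper: drop optional ';' delimiters and collapse all
// whitespace, leaving the numbers separated by single spaces.
std::string stripDelimiters(const std::string& text) {
    std::string cleaned = text;
    for (char& c : cleaned)
        if (c == ';') c = ' ';  // treat the optional delimiter as whitespace
    std::istringstream in(cleaned);
    std::string token, out;
    while (in >> token) {
        if (!out.empty()) out.push_back(' ');
        out += token;
    }
    return out;
}
```

Because the output is canonical, two files that differ only in where the optional delimiters were placed would compress to the same bytes.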
Also, keep in mind that if the size of the file itself is an issue, the entire file can be compressed, not just these strings of numbers. The point of compressing the numbers inside the XML file is (I believe) so that the *rest* of the file is easier to view manually.
-Lucian
_______________________________________________
sbml-spatial mailing list
sbm...@li...
https://lists.sourceforge.net/lists/listinfo/sbml-spatial
--
Dr. Samuel H. Friedman
University of Southern California Postdoctoral Scholar - Research Associate
Center for Applied Molecular Medicine Keck School of Medicine
Email: sam...@ca... Phone: 323-442-2531
2250 Alcazar St Rm 259 Los Angeles, CA 90033
|
|
From: Andy S. <and...@gm...> - 2014-10-22 12:20:59
|
What if you're not using std::vector to store your data? Say your data is stored in an Eigen or Boost matrix? Or even your own data structure. Every numeric data structure (including std::vector) has a way of getting a pointer to the data; then you just pass this pointer to whatever function you want to read/write that data.
On Wednesday, October 22, 2014, Weatherby,Gerard <gwe...@uc...> wrote:
> From a C++ perspective, the better API would be
>
> void getData(std::vector<double> & data)
>
> where the implementation could reserve the necessary space (std::vector<>::reserve( ) ) and then fill the data.
>
> From: Frank T. Bergmann [mailto:fbe...@ca...]
> Sent: Wednesday, October 22, 2014 1:51 AM
> To: 'The SBML L3 Spatial Processes and Geometries package discussion list'
> Subject: Re: [sbml-spatial] Compression
>
> Hello Andy,
>
> The API you suggest:
>
> int len = obj->getArrayLen();
> double* myData = new double[len];
> obj->getArrayData(myData);
>
> is indeed what is currently implemented in libSBML.
>
> Frank
>
> From: Andy Somogyi [mailto:and...@gm...]
> Sent: Tuesday, October 21, 2014 8:00 PM
> To: The SBML L3 Spatial Processes and Geometries package discussion list
> Subject: Re: [sbml-spatial] Compression
>
> On the API side, I'm asking, please, please, please do not introduce a matrix or array class, and especially please don't return array data by value.
>
> What, I think would work the best is having simple methods to access the array data and have it copied into a user provided buffer, something like
>
> int len = obj->getArrayLen();
> double* myData = new double[len];
> obj->getArrayData(myData);
>
> If it were on the return by value, something like
>
> vector<double> data = obj->getArrayData();
>
> this would result in a huge number of memory allocations and data copies that could easily be avoided if the data were just copied once into a user provided buffer.
>
> On Oct 21, 2014, at 1:46 PM, Devin Sullivan wrote:
>
> I will also voice a vote for option #2.
>
> On Fri, Oct 17, 2014 at 7:14 PM, Samuel Friedman <sam...@ca...> wrote:
>
> I agree with what Paul has said. If you're going to do compression, you want to do it once and not multiple times so I would vote for path #2. There are three reasons why you really don't want to go down route #3:
>
> 1) Floating point numbers don't compress well generally because they usually have slightly different numbers and hence don't compress well as each one is different.
> 2) Compression algorithms tend to work better on larger chunks of data because they have more data to look at when trying to figure out what to compress.
> 3) If you go to compress your SBML file after you've inserted your compressed floating point numbers, you have done a double compression which is almost never worth your while.
>
> Sam
>
> On Fri, Oct 17, 2014 at 10:13 AM, Paul Macklin <pau...@us...> wrote:
>
> Parsing and postprocessing should be a lot easier and faster if the compression is within the XML (so the tags are still uncompressed and easy to parse), rather than enclosing the XML (so you have to decompress the whole thing prior to parsing and postprocessing / analysis).
|
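The buffer-based access pattern argued for in the message above can be sketched as follows. `ArrayHolder` is a hypothetical stand-in and not a libSBML class; only the method names `getArrayLen` and `getArrayData` come from the thread. The vector overload is an assumed convenience wrapper in the style of the quoted `getData(std::vector<double>&)` suggestion, layered on top of the pointer version.

```cpp
#include <algorithm>
#include <cstddef>
#include <utility>
#include <vector>

// Hypothetical stand-in for a spatial object that owns an array of doubles.
class ArrayHolder {
public:
    explicit ArrayHolder(std::vector<double> values)
        : values_(std::move(values)) {}

    int getArrayLen() const { return static_cast<int>(values_.size()); }

    // Copy the data once into a caller-provided buffer that must hold at
    // least getArrayLen() doubles. Works with any container exposing a pointer.
    void getArrayData(double* out) const {
        std::copy(values_.begin(), values_.end(), out);
    }

    // Assumed convenience overload for std::vector users: size the vector,
    // then reuse the pointer-based copy.
    void getArrayData(std::vector<double>& out) const {
        out.resize(values_.size());
        getArrayData(out.data());
    }

private:
    std::vector<double> values_;
};
```

With this shape a caller holding a std::vector can pass `v.data()`, and a caller holding a matrix type from another library can pass whatever raw pointer that type exposes, so neither side forces a container choice on the other.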
|
From: Weatherby,Gerard <gwe...@uc...> - 2014-10-22 12:10:16
|
>From a C++ perspective, the better API would be void getData(std::vector<double> & data) where the implementation could reserve the necessary space (std::vector<>::reserve( ) ) and then fill the data. From: Frank T. Bergmann [mailto:fbe...@ca...] Sent: Wednesday, October 22, 2014 1:51 AM To: 'The SBML L3 Spatial Processes and Geometries package discussion list' Subject: Re: [sbml-spatial] Compression Hello Andy, The API you suggest: int len = obj->getArrayLen(); double* myData = new double[len]; obj->getArrayData(myData); is indeed what is currently implemented in libSBML. Frank From: Andy Somogyi [mailto:and...@gm...] Sent: Tuesday, October 21, 2014 8:00 PM To: The SBML L3 Spatial Processes and Geometries package discussion list Subject: Re: [sbml-spatial] Compression On the API side, I'm asking, please, please, please do not introduce a matrix or array class, and especially please don't return array data by value. What, I think would work the best is having simple methods to access the array data and have it copied into a user provided buffer, something like int len = obj->getArrayLen(); double* myData = new double[len]; obj->getArrayData(myData); If it were on the return by value, something like vector<double> data = obj->getArrayData(); this would result in a huge number of memory allocations and data copies that could easily be avoided if the data were just copied once into a user provided buffer. On Oct 21, 2014, at 1:46 PM, Devin Sullivan wrote: I will also voice a vote for option #2. On Fri, Oct 17, 2014 at 7:14 PM, Samuel Friedman <sam...@ca...<mailto:sam...@ca...>> wrote: I agree with what Paul has said. If you're going to do compression, you want to do it once and not multiple times so I would vote for path #2. There are three reasons why you really don't want to go down route #3: 1) Floating point numbers don't compress well generally because they usually have slightly different numbers and hence don't compress well as each one is different. 
2) Compression algorithms tend to work better on larger chunks of data because they have more data to look at when trying to figure out what to compress. 3) If you go to compress your SBML file after you've inserted your compressed floating point numbers, you have done a double compression which is almost never worth your while. Sam On Fri, Oct 17, 2014 at 10:13 AM, Paul Macklin <pau...@us...<mailto:pau...@us...>> wrote: Parsing and postprocessing should be a lot easier and faster if the compression is within the XML (so the tags are still uncompressed and easy to parse), rather than enclosing the XML (so you have to decompress the whole thing prior to parsing and postprocessing / analysis). When the files are big and you have a lot of them to process, this becomes significant. Not that these are any of your 1-3 per se, but you do talk about sticking the whole thing into a zip file. We're shying away from that and looking towards HDF and/or XML + base64 because for 3D and multicell work, the files become pretty big and the wait for the zip/unzip process can be a pretty significant bottleneck to analyzing simulation outputs. -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Paul Macklin, Ph.D. Assistant Professor of Research Medicine Center for Applied Molecular Medicine Keck School of Medicine University of Southern California Los Angeles, CA Founder and Co-Lead of the MultiCellDS Project MultiCellDS: http://MultiCellDS.org<http://multicellds.org/> / @MultiCellDS<http://www.twitter.com/MultiCellDS> email: Pau...@us...<mailto:Pau...@us...> / Pau...@Ma...<mailto:Pau...@Ma...> web: http://MathCancer.org<http://mathcancer.org/> Twitter: @MathCancer<http://www.twitter.com/MathCancer> mobile: +1 310-701-5785<tel:%2B1%20310-701-5785> FAX: +1 323-442-2764<tel:%2B1%C2%A0323-442-2764> On Fri, Oct 17, 2014 at 9:58 AM, Lucian Smith <luc...@gm...<mailto:luc...@gm...>> wrote: OK, so one of the options can obviously remain 'write the numbers as a string, store that in the XML' for readability. 
For compression, we have:
1) binary --> string (ftoa) --> compressed string (this is the existing scheme)
2) binary --> base64
3) binary --> base64 --> compressed string
Andy reports that base64 encoding of binary data is about 30% more efficient than string encoding of binary data (ftoa), and also has the advantage of being faster to process when decoding. Since ftoa results in a smaller character set (0-9,-,e,spaces), you'd recover some of that inefficiency if you compared 1) to 3), but probably not all of it. You'd also still have the slower decoding step.
The disadvantage of 3) over 2) is that the resulting .zip file of the entire document would be slightly larger for 3) than for 2), so the question would become: what is the main purpose of encoding the data in the file this way? If it's 'smaller file size', you'd go with 2), but if it's 'less of the file I have to scroll through when reading it by hand', you'd want 3).
Anyone have strong opinions either way? Is this worth an actual poll of the community?
-Lucian
On Wed, Oct 15, 2014 at 6:43 PM, Paul Macklin <pau...@us...<mailto:pau...@us...>> wrote:
Interesting! Perhaps a big improvement to use IEEE and base64 for all numerical fields and get rid of atof?
On Oct 15, 2014 6:29 PM, "Andy Somogyi" <and...@gm...<mailto:and...@gm...>> wrote:
A big part of the slowness comes from parsing a string to float, i.e. atof. Plus atof does not even work the same on different platforms, and different locales throw in another complication. All modern processors use the IEEE 754 double format, so it's actually a much more standard format than textual formatted numbers.
On Wednesday, October 15, 2014, Paul Macklin <pau...@us...<mailto:pau...@us...>> wrote:
Thanks, Andy. Out of curiosity, is that slowness from parsing complexity or from the disk read/write itself? Is it still the same bottleneck if reading/writing files on a solid state disk or ram disk?
On Wed, Oct 15, 2014 at 6:08 PM, Andy Somogyi <and...@gm...<mailto:and...@gm...>> wrote:
Just store the binary array as a base64 encoded blob. Not only will the file size be about 30% the size of converting to strings, but it is an order of magnitude faster in terms of parsing and reading the data. In profiling our simulations, currently the slowest part is reading the sbml, so anything that would improve performance in this area would be very useful.
On Wednesday, October 15, 2014, Paul Macklin <pau...@us...<mailto:pau...@us...>> wrote:
It sounds like #1 converts the numbers to strings in a sprintf-like fashion, and then compresses this string (to another string). It sounds like #2 would directly compress the numbers (in their native binary format), then encode the compressed output as text (e.g., via base64). I was wondering what you thought of a (#1/#2)': encode the doubles/floats/whatever to text via base64 first, compress this, then store the resulting text in the data field.
Thanks -- Paul
On Wed, Oct 15, 2014 at 4:23 PM, Lucian Smith <luc...@gm...<mailto:luc...@gm...>> wrote:
OK, let me see if I can summarize the issues about compression, and ask people's opinions moving forward:
As things stand right now, the spec itself is a little vague on how compression works. This obviously needs to be updated, but we should make sure we know what we want, first.
The libsbml implementation of compression (and used by Frank and Jim) works by compressing a *string* of numbers into a format that can be written into an XML file safely (I still don't know which one, but let's assume that this, at least, doesn't need to be changed). This is why Frank is concerned about the delimiter or lack thereof: all spaces, delimiters, etc. are getting compressed along with everything else. The big advantage of this system is that it's implemented. The disadvantage of this system is that it's fairly inefficient, mostly because encoding a number as a string is inefficient to start with. So that's option #1: keep things as they are implemented now, with possible tweaks for delimiters, etc.
For option #2, we could compress the arrays of numbers directly, and encode that compression in the same way in the XML. This would have the advantage of being more compressed, but has the disadvantage of not being implemented yet.
For option #3, we could ditch compression entirely, and rely instead on our ability to compress the entire SBML document (libsbml has built-in features that let it read and write compressed documents). This would actually result in smaller files if the numbers were all written out than if those number strings were compressed first a la option #1. The disadvantage of this system is that it makes the files really big, and therefore makes it harder to read/debug the parts that *aren't* huge arrays of numbers.
As far as delimiters go, it seemed to me that the simplest option would be to allow a ';' delimiter wherever people wanted it, and to remove it for compression. The order of numbers and their meaning would be precisely defined in the spec, so that special delimiters (besides the space between the numbers themselves) were not strictly needed, but could be provided for readability.
Also, keep in mind that if the size of the file itself is an issue, the entire file can be compressed, not just these strings of numbers. The point of compressing the numbers inside the XML file is (I believe) so that the *rest* of the file is easier to view manually.
-Lucian
------------------------------------------------------------------------------ Comprehensive Server Monitoring with Site24x7. Monitor 10 servers for $9/Month. Get alerted through email, SMS, voice calls or mobile push notifications. Take corrective actions from your mobile device.
http://p.sf.net/sfu/Zoho _______________________________________________ sbml-spatial mailing list sbm...@li...<mailto:sbm...@li...> https://lists.sourceforge.net/lists/listinfo/sbml-spatial -- Dr. Samuel H. Friedman University of Southern California Postdoctoral Scholar - Research Associate Center for Applied Molecular Medicine Keck School of Medicine Email: sam...@ca...<mailto:sam...@ca...> Phone: 323-442-2531<tel:323-442-2531> 2250 Alcazar St Rm 259 Los Angeles, CA 90033 |
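The delimiter proposal in the quoted summary above (allow ';' anywhere for readability, ignore it when reading) and the locale concern raised about atof can be sketched roughly as follows. This is only an illustration: `parseNumberList` is a made-up name, not a libSBML function, and the sketch says nothing about parsing speed. It assumes numbers are separated by whitespace and that ';' carries no meaning of its own.

```cpp
#include <algorithm>
#include <cassert>
#include <locale>
#include <sstream>
#include <string>
#include <vector>

// Hypothetical helper (not part of libSBML): read a text array of numbers.
// ';' is treated as an optional, purely cosmetic delimiter.
std::vector<double> parseNumberList(std::string text) {
    // Turn every ';' into a space so only whitespace separates the values.
    std::replace(text.begin(), text.end(), ';', ' ');

    std::istringstream in(text);
    // Force the classic "C" locale so '.' is always the decimal point,
    // whatever global locale the host application has installed.
    in.imbue(std::locale::classic());

    std::vector<double> values;
    double v;
    while (in >> v) {
        values.push_back(v);
    }
    return values;
}
```

Called as `parseNumberList("1 2.5 -0.25; 4e2 5;6")`, this returns the six values 1, 2.5, -0.25, 400, 5 and 6, so a writer is free to add or drop ';' without changing what a reader sees.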
|
From: Frank T. B. <fbe...@ca...> - 2014-10-22 06:31:16
|
I would expect my CSG objects to be always solid (unless I performed an intersection, explicitly hollowing the object out). We already extended it to deal with 2D elements explicitly, so people already have the option of adding surfaces explicitly. With the MixedGeometry, they could also add a parametric surface patch into the scene as well, if they wanted. So I do not see a reason to complicate the CSG further.
Frank
From: Lucian Smith [mailto:luc...@gm...]
Sent: Thursday, October 16, 2014 1:47 AM
To: The SBML L3 Spatial Processes and Geometries package discussion list
Subject: [sbml-spatial] Solids and surfaces
Another issue that hasn't been resolved yet: do people want to be able to create CSGPrimitive surfaces as well as solids? Presumably, they would be used to create 2D compartments within 3D space (such as cell membranes), and/or 1D compartments within 2D space.
If so, I would propose a new attribute on CSGPrimitive that indicated whether the shape was to be a "solid" or a "surface". For the 2D primitives, will 'solid' and 'surface' suffice, or would we need other terms, like 'filled' and 'border'?
-Lucian |
|
From: Frank T. B. <fbe...@ca...> - 2014-10-22 06:26:35
|
I agree with Devin.
It is the group as a whole that gets scaled in this case, not the individual spheres. If one wanted that, one would simply move the scaling before the union.
This is also what people coming from scene graphs would expect.
Frank
From: Devin Sullivan [mailto:de...@cm...]
Sent: Tuesday, October 21, 2014 7:56 PM
To: The SBML L3 Spatial Processes and Geometries package discussion list
Subject: Re: [sbml-spatial] Scaling composed primitives
Hey Lucian,
This is actually a very interesting point.
I would say that in this case, since your csgScale operator occurs outside of the csgSetOperator that groups the two spheres, you should get a scaling of the joined object itself. Further, I think that all scaling should happen relative to the object center (or, in this case, the center of the joined object). This way you know that your object will end up where you set the translation, at least in the case of a single object. To get the behavior that you describe here, you would simply apply a scaling of 1.5 to each csgPrimitive object rather than outside the csgSetOperator.
At least that's what makes sense to me.
-Devin
On Fri, Oct 17, 2014 at 1:51 PM, Lucian Smith <luc...@gm... <mailto:luc...@gm...> > wrote:
While we're discussing things, I have another unrelated question, which I mentioned in the most recent version of the spec, but which we have not discussed on the list (nor did we discuss it at COMBINE).
When you scale a composed object, it makes a difference whether you scale the composed object, or whether you scale each of the respective primitives.
As an example: imagine you have created two spheres that just touch that collectively define a single CSGNode. You could do this basically as:
union(sphere(), translate(x=1, sphere()))
Or, in XML:
<spatial:csgObject spatial:domainType="EN" spatial:ordinal="0" spatial:id="two_spheres">
  <spatial:csgSetOperator spatial:operationType="union" spatial:id="union1">
    <spatial:listOfCSGNodes>
      <spatial:csgPrimitive primitiveType="sphere"/>
      <spatial:csgTranslation spatial:id="translation" spatial:translateX="1" spatial:translateY="0" spatial:translateZ="0">
        <spatial:csgPrimitive primitiveType="sphere"/>
      </spatial:csgTranslation>
    </spatial:listOfCSGNodes>
  </spatial:csgSetOperator>
</spatial:csgObject>
Now suppose you want to scale this by 1.5 in all directions:
<spatial:csgObject spatial:domainType="EN" spatial:ordinal="0" spatial:id="two_spheres">
  <spatial:csgScale scaleX="1.5" scaleY="1.5" scaleZ="1.5">
    <spatial:csgSetOperator spatial:operationType="union" spatial:id="union1">
      <spatial:listOfCSGNodes>
        <spatial:csgPrimitive primitiveType="sphere"/>
        <spatial:csgTranslation spatial:id="translation" spatial:translateX="1" spatial:translateY="0" spatial:translateZ="0">
          <spatial:csgPrimitive primitiveType="sphere"/>
        </spatial:csgTranslation>
      </spatial:listOfCSGNodes>
    </spatial:csgSetOperator>
  </spatial:csgScale>
</spatial:csgObject>
The resulting shape will look different and be in different positions depending on how we define things. First, do you scale the entire two-sphere shape as a whole, or do you scale the individual spheres? If you scale the entire shape, you will end up with two spheres that again just touch. If you scale the individual spheres, both will grow, overlapping each other. The other thing you need to decide is what the center of the scaling is--if you scale the entire shape centered at the shape's 'center of mass' (the point where they touch, in this instance), the two spheres will essentially grow out from that point. If they are scaled from the origin, the first sphere will simply grow, while the one it touches will be pushed outward.
In the spec, I wrote that each individual shape should be individually scaled, which in this case would mean that the spheres would grow and overlap each other. But that was based entirely on my intuition of what might be easier to implement, and not on any actual experience. Have any of you actually done scaling of grouped elements like this? How did you do it? Did you decide based on what was easy to implement, or based on what a user would more likely want, or something else?
-Lucian
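The alternatives described above can be worked through numerically with a small C++ sketch. Everything here is hypothetical and for illustration only: the `Sphere` struct and the function names are not part of libSBML or the spec, and the sketch assumes that `sphere()` is a unit-diameter sphere (radius 0.5) centered at the origin, which is what makes the two spheres in the example just touch.

```cpp
#include <cassert>
#include <cmath>
#include <vector>

// Hypothetical helper type for illustration only.
struct Sphere {
    double x, y, z;  // center
    double r;        // radius
};

// Uniformly scale one sphere by factor s about the pivot (px, py, pz):
// the center moves away from the pivot and the radius is multiplied by s.
Sphere scaleAbout(const Sphere& sph, double s, double px, double py, double pz) {
    return Sphere{px + s * (sph.x - px),
                  py + s * (sph.y - py),
                  pz + s * (sph.z - pz),
                  s * sph.r};
}

// Scale the union as a whole: every member is scaled about the same pivot.
std::vector<Sphere> scaleGroup(const std::vector<Sphere>& group, double s,
                               double px, double py, double pz) {
    std::vector<Sphere> out;
    for (const Sphere& sph : group) {
        out.push_back(scaleAbout(sph, s, px, py, pz));
    }
    return out;
}

// Scale each primitive about its own center: centers stay put, radii grow.
std::vector<Sphere> scaleEach(const std::vector<Sphere>& group, double s) {
    std::vector<Sphere> out;
    for (const Sphere& sph : group) {
        out.push_back(scaleAbout(sph, s, sph.x, sph.y, sph.z));
    }
    return out;
}

// Signed gap between two spheres: 0 means touching, negative means overlap.
double gap(const Sphere& a, const Sphere& b) {
    const double dx = a.x - b.x, dy = a.y - b.y, dz = a.z - b.z;
    return std::sqrt(dx * dx + dy * dy + dz * dz) - (a.r + b.r);
}

// union(sphere(), translate(x=1, sphere())) under the unit-diameter assumption.
std::vector<Sphere> twoSpheres() {
    return {Sphere{0.0, 0.0, 0.0, 0.5}, Sphere{1.0, 0.0, 0.0, 0.5}};
}
```

With a factor of 1.5, `scaleGroup(twoSpheres(), 1.5, 0, 0, 0)` keeps the first center at x = 0 and pushes the second to x = 1.5, and the spheres still touch. `scaleGroup(twoSpheres(), 1.5, 0.5, 0, 0)` (pivot at the touching point) gives centers at x = -0.25 and x = 1.25, again touching. `scaleEach(twoSpheres(), 1.5)` leaves the centers at x = 0 and x = 1 with radii of 0.75, so the spheres overlap by 0.5.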
|
|
From: Frank T. B. <fbe...@ca...> - 2014-10-22 05:55:27
|
I guess I'm the only one with a strong preference for the status quo, as it is well tested and already being used by several tools. Moreover, it can be made perfectly human readable (or reasonably well compressed, without getting into the nitty-gritty of endianness) and does not require any overhead in the parsing. Before you put this to a vote, I suggest that you construct several examples and show what they would look like in a final document. That way we can easily evaluate the trade-off between saving a couple of bytes and the effort needed to read and write.
Thanks
Frank
From: Lucian Smith [mailto:luc...@gm...]
Sent: Friday, October 17, 2014 6:58 PM
To: The SBML L3 Spatial Processes and Geometries package discussion list
Subject: Re: [sbml-spatial] Compression
OK, so one of the options can obviously remain 'write the numbers as a string, store that in the XML' for readability. For compression, we have:
1) binary --> string (ftoa) --> compressed string (this is the existing scheme)
2) binary --> base64
3) binary --> base64 --> compressed string
Andy reports that base64 encoding of binary data is about 30% more efficient than string encoding of binary data (ftoa), and also has the advantage of being faster to process when decoding. Since ftoa results in a smaller character set (0-9,-,e,spaces), you'd recover some of that inefficiency if you compared 1) to 3), but probably not all of it. You'd also still have the slower decoding step.
The disadvantage of 3) over 2) is that the resulting .zip file of the entire document would be slightly larger for 3) than for 2), so the question would become: what is the main purpose of encoding the data in the file this way? If it's 'smaller file size', you'd go with 2), but if it's 'less of the file I have to scroll through when reading it by hand', you'd want 3).
Anyone have strong opinions either way? Is this worth an actual poll of the community?
-Lucian |
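To make the base64 option from this thread a little more concrete, here is a rough C++ sketch of "binary --> base64" for an array of doubles. It is not libSBML's implementation and none of these function names exist there. It assumes that `double` is a 64-bit IEEE 754 value, and it picks little-endian byte order as an arbitrary convention (a spec would have to fix one) so that the output does not depend on the host's endianness.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Write each double as 8 bytes, least significant byte first.
std::vector<unsigned char> packLittleEndian(const std::vector<double>& values) {
    static_assert(sizeof(double) == sizeof(std::uint64_t), "expects 64-bit double");
    std::vector<unsigned char> bytes;
    bytes.reserve(values.size() * 8);
    for (double v : values) {
        std::uint64_t bits;
        std::memcpy(&bits, &v, sizeof bits);  // reinterpret without aliasing issues
        for (int i = 0; i < 8; ++i) {
            bytes.push_back(static_cast<unsigned char>((bits >> (8 * i)) & 0xFF));
        }
    }
    return bytes;
}

// Inverse of packLittleEndian; trailing bytes that do not fill a double are ignored.
std::vector<double> unpackLittleEndian(const std::vector<unsigned char>& bytes) {
    std::vector<double> values;
    for (std::size_t off = 0; off + 8 <= bytes.size(); off += 8) {
        std::uint64_t bits = 0;
        for (int i = 0; i < 8; ++i) {
            bits |= static_cast<std::uint64_t>(bytes[off + i]) << (8 * i);
        }
        double v;
        std::memcpy(&v, &bits, sizeof v);
        values.push_back(v);
    }
    return values;
}

// Standard base64 alphabet with '=' padding.
std::string base64Encode(const std::vector<unsigned char>& in) {
    static const char kTable[] =
        "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";
    std::string out;
    for (std::size_t i = 0; i < in.size(); i += 3) {
        const std::size_t n = std::min<std::size_t>(3, in.size() - i);  // bytes in group
        std::uint32_t group = static_cast<std::uint32_t>(in[i]) << 16;
        if (n > 1) group |= static_cast<std::uint32_t>(in[i + 1]) << 8;
        if (n > 2) group |= in[i + 2];
        out += kTable[(group >> 18) & 63];
        out += kTable[(group >> 12) & 63];
        out += (n > 1) ? kTable[(group >> 6) & 63] : '=';
        out += (n > 2) ? kTable[group & 63] : '=';
    }
    return out;
}

// Decode base64 text; padding, whitespace and unknown characters are skipped.
std::vector<unsigned char> base64Decode(const std::string& text) {
    auto value = [](char c) -> int {
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 52;
        if (c == '+') return 62;
        if (c == '/') return 63;
        return -1;
    };
    std::vector<unsigned char> out;
    std::uint32_t acc = 0;  // bit accumulator; only the low bits are ever used
    int bits = 0;           // number of not-yet-emitted bits in acc
    for (char c : text) {
        const int v = value(c);
        if (v < 0) continue;
        acc = (acc << 6) | static_cast<std::uint32_t>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            out.push_back(static_cast<unsigned char>((acc >> bits) & 0xFF));
        }
    }
    return out;
}
```

For example, `base64Encode(packLittleEndian({1.0}))` yields `AAAAAAAA8D8=`. Each double takes 8 bytes and base64 turns every 3 bytes into 4 characters, so n doubles always need 4*ceil(8n/3) characters regardless of their values, whereas the length of the text form depends on how many digits are written out.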
|
From: Frank T. B. <fbe...@ca...> - 2014-10-22 05:51:17
|
Hello Andy,
The API you suggest:
int len = obj->getArrayLen();
double* myData = new double[len];
obj->getArrayData(myData);
is indeed what is currently implemented in libSBML.
Frank
From: Andy Somogyi [mailto:and...@gm...]
Sent: Tuesday, October 21, 2014 8:00 PM
To: The SBML L3 Spatial Processes and Geometries package discussion list
Subject: Re: [sbml-spatial] Compression
On the API side, I'm asking, please, please, please do not introduce a matrix or array class, and especially please don't return array data by value. What I think would work best is having simple methods to access the array data and have it copied into a user-provided buffer, something like
int len = obj->getArrayLen();
double* myData = new double[len];
obj->getArrayData(myData);
If it were return by value, something like
vector<double> data = obj->getArrayData();
this would result in a huge number of memory allocations and data copies that could easily be avoided if the data were just copied once into a user-provided buffer.
On Oct 21, 2014, at 1:46 PM, Devin Sullivan wrote:
I will also voice a vote for option #2.
On Fri, Oct 17, 2014 at 7:14 PM, Samuel Friedman <sam...@ca... <mailto:sam...@ca...> > wrote:
I agree with what Paul has said. If you're going to do compression, you want to do it once and not multiple times, so I would vote for path #2. There are three reasons why you really don't want to go down route #3:
1) Floating point numbers generally don't compress well, because the values usually all differ slightly from one another.
2) Compression algorithms tend to work better on larger chunks of data because they have more data to look at when trying to figure out what to compress.
3) If you go to compress your SBML file after you've inserted your compressed floating point numbers, you have done a double compression which is almost never worth your while.
Sam
On Fri, Oct 17, 2014 at 10:13 AM, Paul Macklin <pau...@us...
<mailto:pau...@us...> > wrote: Parsing and postprocessing should be a lot easier and faster if the compression is within the XML (so the tags are still uncompressed and easy to parse), rather than enclosing the XML (so you have to decompress the whole thing prior to parsing and postprocessing / analysis). When the files are big and you have a lot of them to process, this becomes significant. Not that these are any of your 1-3 per se, but you do talk about sticking the whole thing into a zip file. We're shying away from that and looking towards HDF and/or XML + base64 because for 3D and multicell work, the files become pretty big and the wait for the zip/unzip process can be a pretty significant bottleneck to analyzing simulation outputs. -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- Paul Macklin, Ph.D. Assistant Professor of Research Medicine Center for Applied Molecular Medicine Keck School of Medicine University of Southern California Los Angeles, CA Founder and Co-Lead of the MultiCellDS Project MultiCellDS: <http://multicellds.org/> http://MultiCellDS.org / <http://www.twitter.com/MultiCellDS> @MultiCellDS email: <mailto:Pau...@us...> Pau...@us... / <mailto:Pau...@Ma...> Pau...@Ma... web: <http://mathcancer.org/> http://MathCancer.org Twitter: <http://www.twitter.com/MathCancer> @MathCancer mobile: +1 310-701-5785 <tel:%2B1%20310-701-5785> FAX: +1 <tel:%2B1%C2%A0323-442-2764> 323-442-2764 On Fri, Oct 17, 2014 at 9:58 AM, Lucian Smith <luc...@gm... <mailto:luc...@gm...> > wrote: OK, so one of the options can obviously remain 'write the numbers as a string, store that in the XML' for readability. For compression, we have: 1) binary --> string (ftoa) --> compressed string (this is the existing scheme) 2) binary --> base64 3) binary --> base64 --> compressed string Andy reports that base64 encoding of binary data is about 30% more efficient than string encoding of binary data (ftoa), and also has the advantage of being faster to process when decoding. 
Since ftoa results in a smaller character set (0-9,-,e,spaces), you'd recover some of that inefficiency if you compared 1) to 3), but probably not all of it. You'd also still have the slower decoding step.

The disadvantage of 3) over 2) is that the resulting .zip file of the entire document would be slightly larger for 3) than for 2), so the question would become: what is the main purpose of encoding the data in the file this way? If it's 'smaller file size', you'd go with 2), but if it's 'less of the file I have to scroll through when reading it by hand', you'd want 3).

Anyone have strong opinions either way? Is this worth an actual poll of the community?

-Lucian

On Wed, Oct 15, 2014 at 6:43 PM, Paul Macklin <pau...@us...> wrote:

Interesting! Perhaps a big improvement to use IEEE and base64 for all numerical fields and get rid of atof?

On Oct 15, 2014 6:29 PM, "Andy Somogyi" <and...@gm...> wrote:

A big part of the slowness comes from parsing a string to float, i.e. atof. Plus atof does not even work the same on different platforms, and different locales throw in another complication. All modern processors use the IEEE 754 double format, so it's actually a much more standard format than textually formatted numbers.

On Wednesday, October 15, 2014, Paul Macklin <pau...@us...> wrote:

Thanks, Andy. Out of curiosity, is that slowness from parsing complexity or from the disk read/write itself? Is it still the same bottleneck if reading/writing files on a solid state disk or ram disk?

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Paul Macklin, Ph.D.
Assistant Professor of Research Medicine
Center for Applied Molecular Medicine
Keck School of Medicine
University of Southern California
Los Angeles, CA
Founder and Co-Lead of the MultiCellDS Project
MultiCellDS: http://MultiCellDS.org / @MultiCellDS
email: Pau...@us... / Pau...@Ma...
web: http://MathCancer.org
Twitter: @MathCancer
mobile: +1 310-701-5785
FAX: +1 323-442-2764

On Wed, Oct 15, 2014 at 6:08 PM, Andy Somogyi <and...@gm...> wrote:

Just store the binary array as a base64 encoded blob. Not only will the file size be about 30% the size of converting to strings, but it is an order of magnitude faster in terms of parsing and reading the data. In profiling our simulations, currently the slowest part is reading the sbml, so anything that would improve performance in this area would be very useful.

On Wednesday, October 15, 2014, Paul Macklin <pau...@us...> wrote:

It sounds like #1 converts the numbers to strings in a sprintf-like fashion, and then compresses this string (to another string). It sounds like #2 would directly compress the numbers (in their native binary format), then encode the compressed output as text (e.g., via base64). I was wondering what you thought of a (#1/#2)': encode the doubles/floats/whatever to text via base64 first, compress this, then store the resulting text in the data field.

Thanks -- Paul

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Paul Macklin, Ph.D.
Assistant Professor of Research Medicine
Center for Applied Molecular Medicine
Keck School of Medicine
University of Southern California
Los Angeles, CA
Founder and Co-Lead of the MultiCellDS Project
MultiCellDS: http://MultiCellDS.org / @MultiCellDS
email: Pau...@us... / Pau...@Ma...
web: http://MathCancer.org
Twitter: @MathCancer
mobile: +1 310-701-5785
FAX: +1 323-442-2764

On Wed, Oct 15, 2014 at 4:23 PM, Lucian Smith <luc...@gm...> wrote:

OK, let me see if I can summarize the issues about compression, and ask people's opinions moving forward:

As things stand right now, the spec itself is a little vague on how compression works. This obviously needs to be updated, but we should make sure we know what we want, first.

The libsbml implementation of compression (the one used by Frank and Jim) works by compressing a *string* of numbers into a format that can be written into an XML file safely (I still don't know which one, but let's assume that this, at least, doesn't need to be changed). This is why Frank is concerned about the delimiter or lack thereof: all spaces, delimiters, etc. are getting compressed along with everything else.

The big advantage of this system is that it's implemented. The disadvantage of this system is that it's fairly inefficient, mostly because encoding a number as a string is inefficient to start with.

So that's option #1: keep things as they are implemented now, with possible tweaks for delimiters, etc.

For option #2, we could compress the arrays of numbers directly, and encode that compression in the same way in the XML. This would have the advantage of being more compressed, but has the disadvantage of not being implemented yet.

For option #3, we could ditch compression entirely, and rely instead on our ability to compress the entire SBML document (libsbml has built-in features that let it read and write to compressed documents). This would actually result in smaller files if the numbers were all written out than if those number strings were compressed first a la option #1. The disadvantage of this system is that it makes the files really big, and therefore harder to read/debug the parts that *aren't* huge arrays of numbers.

As far as delimiters go, it seemed to me that the simplest option would be to allow a ';' delimiter wherever people wanted it, and to remove it for compression.
The order of numbers and their meaning would be precisely defined in the spec, so that special delimiters (besides the space between the numbers themselves) were not strictly needed, but could be provided for readability.

Also, keep in mind that if the size of the file itself is an issue, the entire file can be compressed, not just these strings of numbers. The point of compressing the numbers inside the XML file is (I believe) so that the *rest* of the file is easier to view manually.

-Lucian

------------------------------------------------------------------------------
Comprehensive Server Monitoring with Site24x7.
Monitor 10 servers for $9/Month.
Get alerted through email, SMS, voice calls or mobile push notifications.
Take corrective actions from your mobile device.
http://p.sf.net/sfu/Zoho
_______________________________________________
sbml-spatial mailing list
sbm...@li...
https://lists.sourceforge.net/lists/listinfo/sbml-spatial

--
Dr. Samuel H. Friedman
University of Southern California Postdoctoral Scholar - Research Associate
Center for Applied Molecular Medicine Keck School of Medicine
Email: sam...@ca... Phone: 323-442-2531
2250 Alcazar St Rm 259 Los Angeles, CA 90033 |
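For readers following the thread, here is a minimal sketch of option 2) as discussed above: the raw IEEE 754 bytes of a double array written as a base64 blob, with no ftoa/atof step. The names `encodeDoubles` and `decodeDoubles` are hypothetical and are not libsbml functions. The sketch also assumes that writer and reader share the same byte order, which a real spec would have to pin down, and it applies no compression.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <string>
#include <vector>

// Standard base64 alphabet (RFC 4648).
static const char kB64[] =
    "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Hypothetical helper: base64-encode the raw bytes of a double array.
std::string encodeDoubles(const double* data, std::size_t count) {
    const unsigned char* bytes = reinterpret_cast<const unsigned char*>(data);
    const std::size_t n = count * sizeof(double);
    std::string out;
    out.reserve(((n + 2) / 3) * 4);
    for (std::size_t i = 0; i < n; i += 3) {
        // Pack up to three bytes into one 24-bit group.
        std::uint32_t group = static_cast<std::uint32_t>(bytes[i]) << 16;
        if (i + 1 < n) group |= static_cast<std::uint32_t>(bytes[i + 1]) << 8;
        if (i + 2 < n) group |= static_cast<std::uint32_t>(bytes[i + 2]);
        out.push_back(kB64[(group >> 18) & 0x3F]);
        out.push_back(kB64[(group >> 12) & 0x3F]);
        out.push_back(i + 1 < n ? kB64[(group >> 6) & 0x3F] : '=');
        out.push_back(i + 2 < n ? kB64[group & 0x3F] : '=');
    }
    return out;
}

// Map one base64 character back to its 6-bit value (-1 if invalid).
static int b64Value(char c) {
    if (c == '\0') return -1;
    const char* p = std::strchr(kB64, c);
    return p ? static_cast<int>(p - kB64) : -1;
}

// Hypothetical helper: decode into a caller-provided buffer of `count` doubles.
// Returns false if the text is invalid or does not hold exactly `count` doubles.
bool decodeDoubles(const std::string& text, double* out, std::size_t count) {
    assert(out != nullptr || count == 0);
    std::vector<unsigned char> bytes;
    std::uint32_t group = 0;
    int bits = 0;
    for (char c : text) {
        if (c == '=') break;  // padding marks the end of the data
        const int v = b64Value(c);
        if (v < 0) return false;
        group = (group << 6) | static_cast<std::uint32_t>(v);
        bits += 6;
        if (bits >= 8) {
            bits -= 8;
            bytes.push_back(static_cast<unsigned char>((group >> bits) & 0xFF));
        }
    }
    if (bytes.size() != count * sizeof(double)) return false;
    if (!bytes.empty()) std::memcpy(out, bytes.data(), bytes.size());
    return true;
}
```

Base64 emits 4 characters for every 3 bytes, so each 8-byte double costs a little under 11 characters whatever its value, whereas a lossless decimal rendering can need up to 17 significant digits plus sign, exponent, and delimiter. Decoding is a table lookup and a memcpy rather than a locale-sensitive string parse.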
|
From: Andy S. <and...@gm...> - 2014-10-21 18:00:17
|
On the API side, I'm asking, please, please, please do not introduce a matrix or array class, and especially please don't return array data by value.

What I think would work best is having simple methods to access the array data and have it copied into a user-provided buffer, something like:

    int len = obj->getArrayLen();
    double* myData = new double[len];
    obj->getArrayData(myData);

If it were instead returned by value, something like:

    vector<double> data = obj->getArrayData();

this would result in a huge number of memory allocations and data copies that could easily be avoided if the data were just copied once into a user-provided buffer.

On Oct 21, 2014, at 1:46 PM, Devin Sullivan wrote:

> I will also voice a vote for option #2.
>
> On Fri, Oct 17, 2014 at 7:14 PM, Samuel Friedman <sam...@ca...> wrote:
> I agree with what Paul has said. If you're going to do compression, you want to do it once and not multiple times so I would vote for path #2. There are three reasons why you really don't want to go down route #3:
>
> 1) Floating point numbers don't compress well generally because they usually have slightly different numbers and hence don't compress well as each one is different.
> 2) Compression algorithms tend to work better on larger chunks of data because they have more data to look at when trying to figure out what to compress.
> 3) If you go to compress your SBML file after you've inserted your compressed floating point numbers, you have done a double compression which is almost never worth your while.
>
>
> Sam
>
> On Fri, Oct 17, 2014 at 10:13 AM, Paul Macklin <pau...@us...> wrote:
> Parsing and postprocessing should be a lot easier and faster if the compression is within the XML (so the tags are still uncompressed and easy to parse), rather than enclosing the XML (so you have to decompress the whole thing prior to parsing and postprocessing / analysis). When the files are big and you have a lot of them to process, this becomes significant.
> > Not that these are any of your 1-3 per se, but you do talk about sticking the whole thing into a zip file. We're shying away from that and looking towards HDF and/or XML + base64 because for 3D and multicell work, the files become pretty big and the wait for the zip/unzip process can be a pretty significant bottleneck to analyzing simulation outputs. > > > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- > Paul Macklin, Ph.D. > > Assistant Professor of Research Medicine > Center for Applied Molecular Medicine > Keck School of Medicine > University of Southern California > Los Angeles, CA > > Founder and Co-Lead of the MultiCellDS Project > MultiCellDS: http://MultiCellDS.org / @MultiCellDS > > email: Pau...@us... / Pau...@Ma... > web: http://MathCancer.org > Twitter: @MathCancer > > mobile: +1 310-701-5785 > FAX: +1 323-442-2764 > > > On Fri, Oct 17, 2014 at 9:58 AM, Lucian Smith <luc...@gm...> wrote: > OK, so one of the options can obviously remain 'write the numbers as a string, store that in the XML' for readability. For compression, we have: > > 1) binary --> string (ftoa) --> compressed string (this is the existing scheme) > 2) binary --> base64 > 3) binary --> base64 --> compressed string > > Andy reports that base64 encoding of binary data is about 30% more efficient than string encoding of binary data (ftoa), and also has the advantage of being faster to process when decoding. Since ftoa results in a smaller character set (0-9,-,e,spaces), you'd recover some of that inefficiency if you compared 1) to 3), but probably not all of it. You'd also still have the slower decoding step. > > The disadvantage of 3) over 2) is that the resulting .zip file of the entire document would be slightly larger for 3) than for 2), so the question would become: what is the main purpose of encoding the data in the file this way? If it's 'smaller file size', you'd go with 2), but if it's 'less of the file I have to scroll through when reading it by hand', you'd want 3). 
>
> Anyone have strong opinions either way? Is this worth an actual poll of the community?
>
> -Lucian
>
> On Wed, Oct 15, 2014 at 6:43 PM, Paul Macklin <pau...@us...> wrote:
> Interesting!
>
> Perhaps a big improvement to use IEEE and base64 for all numerical fields and get rid of atof?
>
> On Oct 15, 2014 6:29 PM, "Andy Somogyi" <and...@gm...> wrote:
> A big part of the slowness comes from parsing a string to float, i.e. atof.
>
> Plus atof does not even work the same on different platforms, and different locales throw in another complication.
>
> All modern processors use the IEEE 754 double format, so it's actually a much more standard format than textually formatted numbers.
>
> On Wednesday, October 15, 2014, Paul Macklin <pau...@us...> wrote:
> Thanks, Andy.
>
> Out of curiosity, is that slowness from parsing complexity or from the disk read/write itself? Is it still the same bottleneck if reading/writing files on a solid state disk or ram disk?
>
>
> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
> Paul Macklin, Ph.D.
>
> Assistant Professor of Research Medicine
> Center for Applied Molecular Medicine
> Keck School of Medicine
> University of Southern California
> Los Angeles, CA
>
> Founder and Co-Lead of the MultiCellDS Project
> MultiCellDS: http://MultiCellDS.org / @MultiCellDS
>
> email: Pau...@us... / Pau...@Ma...
> web: http://MathCancer.org
> Twitter: @MathCancer
>
> mobile: +1 310-701-5785
> FAX: +1 323-442-2764
>
>
> On Wed, Oct 15, 2014 at 6:08 PM, Andy Somogyi <and...@gm...> wrote:
> Just store the binary array as a base64 encoded blob.
>
> Not only will the file size be about 30% the size of converting to strings, but it is an order of magnitude faster in terms of parsing and reading the data.
>
> In profiling our simulations, currently the slowest part is reading the sbml, so anything that would improve performance in this area would be very useful.
> > > On Wednesday, October 15, 2014, Paul Macklin <pau...@us...> wrote: > It sounds like #1 converts the numbers to strings in a sprintf-like fashion, and then compresses this string (to another string). > > It sounds like #2 would directly compress the numbers (in their native binary format), then encode the compressed output as text (e.g., via base64) > > I was wondering what you thought of a (#1/#2)': encode the doubles/floats/whatever to text via base64 first, compress this, then store the resulting text in the data field. > > Thanks -- Paul > > > -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- > Paul Macklin, Ph.D. > > Assistant Professor of Research Medicine > Center for Applied Molecular Medicine > Keck School of Medicine > University of Southern California > Los Angeles, CA > > Founder and Co-Lead of the MultiCellDS Project > MultiCellDS: http://MultiCellDS.org / @MultiCellDS > > email: Pau...@us... / Pau...@Ma... > web: http://MathCancer.org > Twitter: @MathCancer > > mobile: +1 310-701-5785 > FAX: +1 323-442-2764 > > > On Wed, Oct 15, 2014 at 4:23 PM, Lucian Smith <luc...@gm...> wrote: > OK, let me see if I can summarize the issues about compression, and ask people's opinions moving forward: > > As things stand right now, the spec itself is a little vague on how compression works. This obviously needs to be updated, but we should make sure we know what we want, first. > > The libsbml implementation of compression (and used by Frank and Jim) works by compressing a *string* of numbers into a format that can be written into an XML file safely (I still don't know which one, but let's assume that this, at least, doesn't need to be changed). This is why Frank is concerned about the delimiter or lack thereof: all spaces, delimiters, etc. are getting compressed along with everything else. > > The big advantage of this system is that it's implemented. 
>
> The disadvantage of this system is that it's fairly inefficient, mostly because encoding a number as a string is inefficient to start with.
>
> So that's option #1: keep things as they are implemented now, with possible tweaks for delimiters, etc.
>
>
> For option #2, we could compress the arrays of numbers directly, and encode that compression in the same way in the XML. This would have the advantage of being more compressed, but has the disadvantage of not being implemented yet.
>
>
> For option #3, we could ditch compression entirely, and rely instead on our ability to compress the entire SBML document instead (libsbml has built-in features that let it read and write to compressed documents). This would actually result in smaller files if the numbers were all written out than if those number strings were compressed first a la option #1. This disadvantage of this system is that it makes the files really big, and therefore harder to read/debug the parts that *aren't* huge arrays of numbers.
>
> As far as delimiters go, it seemed to me that the simplest option would be to allow a ';' delimiter wherever people wanted it, and to remove it for compression. The order of numbers and their meaning would be precisely defined in the spec, so that special delimiters (besides the space between the numbers themselves) were not strictly needed, but could be provided for readability.
>
> Also, keep in mind that if the size of the file itself is an issue, the entire file can be compressed, not just these strings of numbers. The point of compressing the numbers inside the XML file is (I believe) so that the *rest* of the file is easier to view manually.
>
> -Lucian
>
> ------------------------------------------------------------------------------
> Comprehensive Server Monitoring with Site24x7.
> Monitor 10 servers for $9/Month.
> Get alerted through email, SMS, voice calls or mobile push notifications.
> Take corrective actions from your mobile device.
> http://p.sf.net/sfu/Zoho
> _______________________________________________
> sbml-spatial mailing list
> sbm...@li...
> https://lists.sourceforge.net/lists/listinfo/sbml-spatial
>
> --
> Dr. Samuel H. Friedman
> University of Southern California Postdoctoral Scholar - Research Associate
> Center for Applied Molecular Medicine Keck School of Medicine
> Email: sam...@ca... Phone: 323-442-2531
> 2250 Alcazar St Rm 259 Los Angeles, CA 90033 |
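The copy-into-a-caller-buffer pattern requested in the message above can be sketched as follows. `SampledArray` and `sumSamples` are hypothetical stand-ins, not libsbml classes or functions. The method names follow the snippet in the message, and the explicit length argument is an assumption added here so the callee can bounds-check the copy.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <utility>
#include <vector>

// Hypothetical stand-in for a spatial class that owns an array of samples.
class SampledArray {
public:
    explicit SampledArray(std::vector<double> values) : mValues(std::move(values)) {}

    // Number of doubles the caller must provide room for.
    int getArrayLen() const { return static_cast<int>(mValues.size()); }

    // Copy at most `len` doubles into the caller's buffer.
    // Returns the number of doubles actually copied.
    int getArrayData(int len, double* data) const {
        assert(data != nullptr || len <= 0);
        const int n = std::min(len, getArrayLen());
        if (n <= 0) return 0;
        std::memcpy(data, mValues.data(), static_cast<std::size_t>(n) * sizeof(double));
        return n;
    }

private:
    std::vector<double> mValues;
};

// Usage sketch: the caller owns `scratch` and can reuse it across many calls,
// so repeated reads only allocate when the buffer has to grow.
double sumSamples(const SampledArray& obj, std::vector<double>& scratch) {
    const int len = obj.getArrayLen();
    if (static_cast<int>(scratch.size()) < len) scratch.resize(len);
    const int copied = obj.getArrayData(len, scratch.data());
    double total = 0.0;
    for (int i = 0; i < copied; ++i) total += scratch[i];
    return total;
}
```

The design point is that the library never decides how the memory is allocated: the same call can fill a stack array, a reused scratch vector, or a buffer handed over by a graphics or physics library.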
|
From: Devin S. <de...@cm...> - 2014-10-21 17:56:23
|
Hey Lucian,

This is actually a very interesting point. I would say that for this case, since your csgScale operator occurs outside of the csgSetOperator which groups the two spheres, you should get a scaling of the joined object itself. Further, I think that all scaling should happen relative to the object center (or in this case the center of the joined object). This way you know that your object will end up where you set the translation, at least in the case of a single object. To get the behavior that you describe here, you would simply set a scaling of 1.5 on each csgPrimitive object rather than outside the csgSetOperator. At least that's what makes sense to me.

-Devin

On Fri, Oct 17, 2014 at 1:51 PM, Lucian Smith <luc...@gm...> wrote:

> While we're discussing things, I have another unrelated question, which I
> mentioned in the most recent version of the spec, but which we have not
> discussed on the list (nor did we discuss it at COMBINE).
>
> When you scale a composed object, it makes a difference whether you scale
> the composed object, or whether you scale each of the respective primitives.
>
> As an example: imagine you have created two spheres that just touch that
> collectively define a single CSGNode. You could do this basically as:
>
> union(sphere(), translate(x=1, sphere()))
>
> Or, in XML:
>
> <spatial:csgObject spatial:domainType="EN" spatial:ordinal="0" spatial:id="two_spheres">
>   <spatial:csgSetOperator spatial:operationType="union" spatial:id="union1">
>     <spatial:listOfCSGNodes>
>       <spatial:csgPrimitive primitiveType="sphere"/>
>       <spatial:csgTranslation spatial:id="translation" spatial:translateX="1" spatial:translateY="0" spatial:translateZ="0">
>         <spatial:csgPrimitive primitiveType="sphere"/>
>       </spatial:csgTranslation>
>     </spatial:listOfCSGNodes>
>   </spatial:csgSetOperator>
> </spatial:csgObject>
>
> Now suppose you want to scale this by 1.5 in all directions:
>
> <spatial:csgObject spatial:domainType="EN" spatial:ordinal="0" spatial:id="two_spheres">
>   <spatial:csgScale scaleX="1.5" scaleY="1.5" scaleZ="1.5">
>     <spatial:csgSetOperator spatial:operationType="union" spatial:id="union1">
>       <spatial:listOfCSGNodes>
>         <spatial:csgPrimitive primitiveType="sphere"/>
>         <spatial:csgTranslation spatial:id="translation" spatial:translateX="1" spatial:translateY="0" spatial:translateZ="0">
>           <spatial:csgPrimitive primitiveType="sphere"/>
>         </spatial:csgTranslation>
>       </spatial:listOfCSGNodes>
>     </spatial:csgSetOperator>
>   </spatial:csgScale>
> </spatial:csgObject>
>
>
> The resulting shape will look different and be in different positions
> depending on how we define things. First, do you scale the entire
> two-sphere shape as a whole, or do you scale the individual spheres? If
> you scale the entire shape, you will end up with two spheres that again
> just touch. If you scale the individual spheres, both will grow,
> overlapping each other. The other thing you need to decide is what the
> center of the scaling is--if you scale the entire shape centered at the
> shape's 'center of mass' (the point where they touch, in this instance),
> the two spheres will essentially grow out from that point. If they are
> scaled from the origin, the first sphere will simply grow, while the one it
> touches will be pushed outward.
>
> In the spec, I wrote that each individual shape should be individually
> scaled, which in this case would mean that the spheres would grow and
> overlap each other. But that was based entirely on my intuition of what
> might be easier to implement, and not on any actual experience. Have any
> of you actually done scaling of grouped elements like this? How did you do
> it? Did you decide based on what was easy to implement, or based on what a
> user would more likely want, or something else?
>
> -Lucian
>
>
> ------------------------------------------------------------------------------
> Comprehensive Server Monitoring with Site24x7.
> Monitor 10 servers for $9/Month.
> Get alerted through email, SMS, voice calls or mobile push notifications.
> Take corrective actions from your mobile device.
> http://p.sf.net/sfu/Zoho
> _______________________________________________
> sbml-spatial mailing list
> sbm...@li...
> https://lists.sourceforge.net/lists/listinfo/sbml-spatial
>
> |
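The three scaling interpretations discussed in this exchange can be compared numerically with a small sketch. It is illustrative only: the `Sphere` struct and the function names are hypothetical, only the x coordinate is tracked, and the sphere radius of 0.5 is an assumption chosen so that a translation of 1 leaves the two spheres just touching, as in the example.

```cpp
#include <cassert>
#include <vector>

// Simplified sphere: only the x coordinate of the center matters for this example.
struct Sphere {
    double centerX;
    double radius;
};

// Scale the composed shape as a whole about `pivotX`:
// centers move away from the pivot and radii grow by the same factor.
std::vector<Sphere> scaleComposite(std::vector<Sphere> shapes, double factor, double pivotX) {
    assert(factor > 0.0);
    for (Sphere& s : shapes) {
        s.centerX = pivotX + factor * (s.centerX - pivotX);
        s.radius *= factor;
    }
    return shapes;
}

// Scale each primitive about its own center: centers stay put, radii grow.
std::vector<Sphere> scaleEachPrimitive(std::vector<Sphere> shapes, double factor) {
    assert(factor > 0.0);
    for (Sphere& s : shapes) s.radius *= factor;
    return shapes;
}

// Signed gap between two spheres along x: 0 means just touching,
// a negative value means they overlap.
double gap(const Sphere& a, const Sphere& b) {
    const double distance = b.centerX > a.centerX ? b.centerX - a.centerX
                                                  : a.centerX - b.centerX;
    return distance - (a.radius + b.radius);
}
```

With the two spheres at x = 0 and x = 1 and a factor of 1.5, scaling the composite about the origin leaves the first center at 0 and pushes the second to 1.5. Scaling about the touching point (x = 0.5) moves the centers to -0.25 and 1.25. In both cases the spheres still just touch. Scaling each primitive in place keeps the centers fixed and produces an overlap of 0.5.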
|
From: Devin S. <de...@cm...> - 2014-10-21 17:46:35
|
I will also voice a vote for option #2. On Fri, Oct 17, 2014 at 7:14 PM, Samuel Friedman < sam...@ca...> wrote: > I agree with what Paul has said. If you're going to do compression, you > want to do it once and not multiple times so I would vote for path #2. > There are three reasons why you really don't want to go down route #3: > > 1) Floating point numbers don't compress well generally because they > usually have slightly different numbers and hence don't compress well as > each one is different. > 2) Compression algorithms tend to work better on larger chunks of data > because they have more data to look at when trying to figure out what to > compress. > 3) If you go to compress your SBML file after you've inserted your > compressed floating point numbers, you have done a double compression which > is almost never worth your while. > > > Sam > > On Fri, Oct 17, 2014 at 10:13 AM, Paul Macklin <pau...@us...> > wrote: > >> Parsing and postprocessing should be a lot easier and faster if the >> compression is within the XML (so the tags are still uncompressed and easy >> to parse), rather than enclosing the XML (so you have to decompress the >> whole thing prior to parsing and postprocessing / analysis). When the >> files are big and you have a lot of them to process, this becomes >> significant. >> >> Not that these are any of your 1-3 per se, but you do talk about sticking >> the whole thing into a zip file. We're shying away from that and looking >> towards HDF and/or XML + base64 because for 3D and multicell work, the >> files become pretty big and the wait for the zip/unzip process can be a >> pretty significant bottleneck to analyzing simulation outputs. >> >> >> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=- >> Paul Macklin, Ph.D. 
>> >> Assistant Professor of Research Medicine >> Center for Applied Molecular Medicine >> Keck School of Medicine >> University of Southern California >> Los Angeles, CA >> >> Founder and Co-Lead of the MultiCellDS Project >> *MultiCellDS*: http://MultiCellDS.org <http://multicellds.org/> / >> @MultiCellDS <http://www.twitter.com/MultiCellDS> >> >> *email*: Pau...@us... / Pau...@Ma... >> *web*: http://MathCancer.org <http://mathcancer.org/> >> *Twitter*: @MathCancer <http://www.twitter.com/MathCancer> >> >> *mobile*: +1 310-701-5785 >> *FAX*: +1 323-442-2764 >> >> >> On Fri, Oct 17, 2014 at 9:58 AM, Lucian Smith <luc...@gm...> >> wrote: >> >>> OK, so one of the options can obviously remain 'write the numbers as a >>> string, store that in the XML' for readability. For compression, we have: >>> >>> 1) binary --> string (ftoa) --> compressed string (this is the existing >>> scheme) >>> 2) binary --> base64 >>> 3) binary --> base64 --> compressed string >>> >>> Andy reports that base64 encoding of binary data is about 30% more >>> efficient than string encoding of binary data (ftoa), and also has the >>> advantage of being faster to process when decoding. Since ftoa results in >>> a smaller character set (0-9,-,e,spaces), you'd recover some of that >>> inefficiency if you compared 1) to 3), but probably not all of it. You'd >>> also still have the slower decoding step. >>> >>> The disadvantage of 3) over 2) is that the resulting .zip file of the >>> entire document would be slightly larger for 3) than for 2), so the >>> question would become: what is the main purpose of encoding the data in >>> the file this way? If it's 'smaller file size', you'd go with 2), but if >>> it's 'less of the file I have to scroll through when reading it by hand', >>> you'd want 3). >>> >>> Anyone have strong opinions either way? Is this worth an actual poll of >>> the community? 
>>>
>>> -Lucian
>>>
>>> On Wed, Oct 15, 2014 at 6:43 PM, Paul Macklin <pau...@us...>
>>> wrote:
>>>
>>>> Interesting!
>>>>
>>>> Perhaps a big improvement to use IEEE and base64 for all numerical
>>>> fields and get rid of atof?
>>>> On Oct 15, 2014 6:29 PM, "Andy Somogyi" <and...@gm...> wrote:
>>>>
>>>>> A big part of the slowness comes from parsing a string to float, i.e. atof.
>>>>>
>>>>> Plus atof does not even work the same on different platforms, and
>>>>> different locales throw in another complication.
>>>>>
>>>>> All modern processors use the IEEE 754 double format, so it's actually a
>>>>> much more standard format than textually formatted numbers.
>>>>>
>>>>> On Wednesday, October 15, 2014, Paul Macklin <pau...@us...>
>>>>> wrote:
>>>>>
>>>>>> Thanks, Andy.
>>>>>>
>>>>>> Out of curiosity, is that slowness from parsing complexity or from
>>>>>> the disk read/write itself? Is it still the same bottleneck if
>>>>>> reading/writing files on a solid state disk or ram disk?
>>>>>>
>>>>>>
>>>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>>>> Paul Macklin, Ph.D.
>>>>>>
>>>>>> Assistant Professor of Research Medicine
>>>>>> Center for Applied Molecular Medicine
>>>>>> Keck School of Medicine
>>>>>> University of Southern California
>>>>>> Los Angeles, CA
>>>>>>
>>>>>> Founder and Co-Lead of the MultiCellDS Project
>>>>>> *MultiCellDS*: http://MultiCellDS.org <http://multicellds.org/> /
>>>>>> @MultiCellDS <http://www.twitter.com/MultiCellDS>
>>>>>>
>>>>>> *email*: Pau...@us... / Pau...@Ma...
>>>>>> *web*: http://MathCancer.org <http://mathcancer.org/>
>>>>>> *Twitter*: @MathCancer <http://www.twitter.com/MathCancer>
>>>>>>
>>>>>> *mobile*: +1 310-701-5785
>>>>>> *FAX*: +1 323-442-2764
>>>>>>
>>>>>>
>>>>>> On Wed, Oct 15, 2014 at 6:08 PM, Andy Somogyi <and...@gm...> wrote:
>>>>>>
>>>>>>> Just store the binary array as a base64 encoded blob.
>>>>>>>
>>>>>>> Not only will the file size be about 30% the size of converting to
>>>>>>> strings, but it is an order of magnitude faster in terms of parsing and
>>>>>>> reading the data.
>>>>>>>
>>>>>>> In profiling our simulations, currently the slowest part is reading
>>>>>>> the sbml, so anything that would improve performance in this area would be
>>>>>>> very useful.
>>>>>>>
>>>>>>>
>>>>>>> On Wednesday, October 15, 2014, Paul Macklin <pau...@us...>
>>>>>>> wrote:
>>>>>>>
>>>>>>>> It sounds like #1 converts the numbers to strings in a sprintf-like
>>>>>>>> fashion, and then compresses this string (to another string).
>>>>>>>>
>>>>>>>> It sounds like #2 would directly compress the numbers (in their
>>>>>>>> native binary format), then encode the compressed output as text (e.g., via
>>>>>>>> base64)
>>>>>>>>
>>>>>>>> I was wondering what you thought of a (#1/#2)': encode the
>>>>>>>> doubles/floats/whatever to text via base64 first, compress this, then store
>>>>>>>> the resulting text in the data field.
>>>>>>>>
>>>>>>>> Thanks -- Paul
>>>>>>>>
>>>>>>>>
>>>>>>>> -=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
>>>>>>>> Paul Macklin, Ph.D.
>>>>>>>>
>>>>>>>> Assistant Professor of Research Medicine
>>>>>>>> Center for Applied Molecular Medicine
>>>>>>>> Keck School of Medicine
>>>>>>>> University of Southern California
>>>>>>>> Los Angeles, CA
>>>>>>>>
>>>>>>>> Founder and Co-Lead of the MultiCellDS Project
>>>>>>>> *MultiCellDS*: http://MultiCellDS.org <http://multicellds.org/> /
>>>>>>>> @MultiCellDS <http://www.twitter.com/MultiCellDS>
>>>>>>>>
>>>>>>>> *email*: Pau...@us... / Pau...@Ma...
>>>>>>>> *web*: http://MathCancer.org <http://mathcancer.org/> >>>>>>>> *Twitter*: @MathCancer <http://www.twitter.com/MathCancer> >>>>>>>> >>>>>>>> *mobile*: +1 310-701-5785 >>>>>>>> *FAX*: +1 323-442-2764 >>>>>>>> >>>>>>>> >>>>>>>> On Wed, Oct 15, 2014 at 4:23 PM, Lucian Smith < >>>>>>>> luc...@gm...> wrote: >>>>>>>> >>>>>>>>> OK, let me see if I can summarize the issues about compression, >>>>>>>>> and ask people's opinions moving forward: >>>>>>>>> >>>>>>>>> As things stand right now, the spec itself is a little vague on >>>>>>>>> how compression works. This obviously needs to be updated, but we should >>>>>>>>> make sure we know what we want, first. >>>>>>>>> >>>>>>>>> The libsbml implementation of compression (and used by Frank and >>>>>>>>> Jim) works by compressing a *string* of numbers into a format that can be >>>>>>>>> written into an XML file safely (I still don't know which one, but let's >>>>>>>>> assume that this, at least, doesn't need to be changed). This is why Frank >>>>>>>>> is concerned about the delimiter or lack thereof: all spaces, delimiters, >>>>>>>>> etc. are getting compressed along with everything else. >>>>>>>>> >>>>>>>>> The big advantage of this system is that it's implemented. >>>>>>>>> >>>>>>>>> The disadvantage of this system is that it's fairly inefficient, >>>>>>>>> mostly because encoding a number as a string is inefficient to start with. >>>>>>>>> >>>>>>>>> So that's option #1: keep things as they are implemented now, >>>>>>>>> with possible tweaks for delimiters, etc. >>>>>>>>> >>>>>>>>> >>>>>>>>> For option #2, we could compress the arrays of numbers directly, >>>>>>>>> and encode that compression in the same way in the XML. This would have >>>>>>>>> the advantage of being more compressed, but has the disadvantage of not >>>>>>>>> being implemented yet. 
>>>>>>>>> >>>>>>>>> >>>>>>>>> For option #3, we could ditch compression entirely, and rely >>>>>>>>> instead on our ability to compress the entire SBML document instead >>>>>>>>> (libsbml has built-in features that let it read and write to compressed >>>>>>>>> documents). This would actually result in smaller files if the numbers >>>>>>>>> were all written out than if those number strings were compressed first a >>>>>>>>> la option #1. This disadvantage of this system is that it makes the files >>>>>>>>> really big, and therefore harder to read/debug the parts that *aren't* huge >>>>>>>>> arrays of numbers. >>>>>>>>> >>>>>>>>> As far as delimiters go, it seemed to me that the simplest option >>>>>>>>> would be to allow a ';' delimiter wherever people wanted it, and to remove >>>>>>>>> it for compression. The order of numbers and their meaning would be >>>>>>>>> precisely defined in the spec, so that special delimiters (besides the >>>>>>>>> space between the numbers themselves) were not strictly needed, but could >>>>>>>>> be provided for readability. >>>>>>>>> >>>>>>>>> Also, keep in mind that if the size of the file itself is an >>>>>>>>> issue, the entire file can be compressed, not just these strings of >>>>>>>>> numbers. The point of compressing the numbers inside the XML file is (I >>>>>>>>> believe) so that the *rest* of the file is easier to view manually. >>>>>>>>> >>>>>>>>> -Lucian >>>>>>>>> >>>>>>>>> >>>>>>>>> ------------------------------------------------------------------------------ >>>>>>>>> Comprehensive Server Monitoring with Site24x7. >>>>>>>>> Monitor 10 servers for $9/Month. >>>>>>>>> Get alerted through email, SMS, voice calls or mobile push >>>>>>>>> notifications. >>>>>>>>> Take corrective actions from your mobile device. >>>>>>>>> http://p.sf.net/sfu/Zoho >>>>>>>>> _______________________________________________ >>>>>>>>> sbml-spatial mailing list >>>>>>>>> sbm...@li... 
>>>>>>>>> https://lists.sourceforge.net/lists/listinfo/sbml-spatial >>>>>>>>> >>>>>>>>> >>>>>>>> >>>>>>> >>>>>>> ------------------------------------------------------------------------------ >>>>>>> Comprehensive Server Monitoring with Site24x7. >>>>>>> Monitor 10 servers for $9/Month. >>>>>>> Get alerted through email, SMS, voice calls or mobile push >>>>>>> notifications. >>>>>>> Take corrective actions from your mobile device. >>>>>>> http://p.sf.net/sfu/Zoho >>>>>>> _______________________________________________ >>>>>>> sbml-spatial mailing list >>>>>>> sbm...@li... >>>>>>> https://lists.sourceforge.net/lists/listinfo/sbml-spatial >>>>>>> >>>>>>> >>>>>> >>>>> >>>>> ------------------------------------------------------------------------------ >>>>> Comprehensive Server Monitoring with Site24x7. >>>>> Monitor 10 servers for $9/Month. >>>>> Get alerted through email, SMS, voice calls or mobile push >>>>> notifications. >>>>> Take corrective actions from your mobile device. >>>>> http://p.sf.net/sfu/Zoho >>>>> _______________________________________________ >>>>> sbml-spatial mailing list >>>>> sbm...@li... >>>>> https://lists.sourceforge.net/lists/listinfo/sbml-spatial >>>>> >>>>> >>>> >>>> ------------------------------------------------------------------------------ >>>> Comprehensive Server Monitoring with Site24x7. >>>> Monitor 10 servers for $9/Month. >>>> Get alerted through email, SMS, voice calls or mobile push >>>> notifications. >>>> Take corrective actions from your mobile device. >>>> http://p.sf.net/sfu/Zoho >>>> _______________________________________________ >>>> sbml-spatial mailing list >>>> sbm...@li... >>>> https://lists.sourceforge.net/lists/listinfo/sbml-spatial >>>> >>>> >>> >>> >>> ------------------------------------------------------------------------------ >>> Comprehensive Server Monitoring with Site24x7. >>> Monitor 10 servers for $9/Month. >>> Get alerted through email, SMS, voice calls or mobile push notifications. 
>>> Take corrective actions from your mobile device. >>> http://p.sf.net/sfu/Zoho >>> _______________________________________________ >>> sbml-spatial mailing list >>> sbm...@li... >>> https://lists.sourceforge.net/lists/listinfo/sbml-spatial >>> >>> >> >> >> ------------------------------------------------------------------------------ >> Comprehensive Server Monitoring with Site24x7. >> Monitor 10 servers for $9/Month. >> Get alerted through email, SMS, voice calls or mobile push notifications. >> Take corrective actions from your mobile device. >> http://p.sf.net/sfu/Zoho >> _______________________________________________ >> sbml-spatial mailing list >> sbm...@li... >> https://lists.sourceforge.net/lists/listinfo/sbml-spatial >> >> > > > -- > Dr. Samuel H. Friedman > University of Southern California Postdoctoral Scholar - Research > Associate > Center for Applied Molecular Medicine Keck School of Medicine > Email: sam...@ca... Phone: 323-442-2531 > 2250 Alcazar St Rm 259 Los Angeles, CA 90033 > > > ------------------------------------------------------------------------------ > Comprehensive Server Monitoring with Site24x7. > Monitor 10 servers for $9/Month. > Get alerted through email, SMS, voice calls or mobile push notifications. > Take corrective actions from your mobile device. > http://p.sf.net/sfu/Zoho > _______________________________________________ > sbml-spatial mailing list > sbm...@li... > https://lists.sourceforge.net/lists/listinfo/sbml-spatial > > |
|
From: Samuel F. <sam...@ca...> - 2014-10-17 23:14:40
|
I agree with what Paul has said. If you're going to do compression, you want to do it once and not multiple times, so I would vote for path #2.
There are three reasons why you really don't want to go down route #3:
1) Floating-point numbers generally don't compress well: the values usually differ slightly from one another, so there is little repetition for a compressor to exploit.
2) Compression algorithms tend to work better on larger chunks of data, because they have more data to look at when deciding what to compress.
3) If you compress your SBML file after you've inserted your compressed floating-point numbers, you have done a double compression, which is almost never worth your while.
Sam
--
Dr. Samuel H. Friedman
University of Southern California Postdoctoral Scholar - Research Associate
Center for Applied Molecular Medicine, Keck School of Medicine
Email: sam...@ca... Phone: 323-442-2531
2250 Alcazar St Rm 259, Los Angeles, CA 90033
|
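The size trade-offs between the encodings discussed in this thread can be checked empirically. The following is only a rough sketch under stated assumptions: it uses zlib as the compressor and base64 to make compressed bytes XML-safe (the thread does not say which encoding libsbml actually uses), and it uses uniformly random doubles as test data, which is close to a worst case for compression. Real simulation arrays may behave quite differently, so no particular result should be read into it.

```python
import base64
import random
import struct
import zlib

random.seed(0)  # fixed seed so repeated runs use the same sample data
values = [random.uniform(0.0, 1.0) for _ in range(10_000)]

# Raw little-endian IEEE 754 doubles, 8 bytes per value.
raw = struct.pack(f"<{len(values)}d", *values)
# Space-delimited decimal text; repr() makes every value round-trip exactly.
text = " ".join(repr(v) for v in values).encode("ascii")


def xml_safe(data: bytes) -> bytes:
    """Base64-encode bytes so they can be stored as XML character data."""
    return base64.b64encode(data)


# Assumption: compressed output is base64-encoded before going into the XML.
sizes = {
    "plain text, uncompressed": len(text),
    "1) text -> zlib -> base64": len(xml_safe(zlib.compress(text))),
    "2) binary -> base64": len(xml_safe(raw)),
    "3) binary -> base64 -> zlib -> base64": len(
        xml_safe(zlib.compress(xml_safe(raw)))
    ),
    "binary -> zlib -> base64": len(xml_safe(zlib.compress(raw))),
}
for name, size in sizes.items():
    print(f"{name:40s}{size:>10d} bytes")

# Scheme 2 decodes without any text-to-float parsing step.
decoded = list(
    struct.unpack(f"<{len(values)}d", base64.b64decode(xml_safe(raw)))
)
assert decoded == values
```

Running the same script on a representative array from an actual model would give a better basis for choosing between the schemes than the random data used here.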
|
From: Lucian S. <luc...@gm...> - 2014-10-17 17:51:29
|
While we're discussing things, I have another unrelated question, which I
mentioned in the most recent version of the spec, but which we have not
discussed on the list (nor did we discuss it at COMBINE).
When you scale a composed object, it makes a difference whether you scale
the composed object, or whether you scale each of the respective primitives.
As an example: imagine you have created two spheres that just touch and
that collectively define a single CSGNode. You could do this basically as:
union(sphere(), translate(x=1, sphere()))
Or, in XML:
<spatial:csgObject spatial:domainType="EN" spatial:ordinal="0"
    spatial:id="two_spheres">
  <spatial:csgSetOperator spatial:operationType="union"
      spatial:id="union1">
    <spatial:listOfCSGNodes>
      <spatial:csgPrimitive primitiveType="sphere"/>
      <spatial:csgTranslation spatial:id="translation"
          spatial:translateX="1" spatial:translateY="0" spatial:translateZ="0">
        <spatial:csgPrimitive primitiveType="sphere"/>
      </spatial:csgTranslation>
    </spatial:listOfCSGNodes>
  </spatial:csgSetOperator>
</spatial:csgObject>
Now suppose you want to scale this by 1.5 in all directions:
<spatial:csgObject spatial:domainType="EN" spatial:ordinal="0"
    spatial:id="two_spheres">
  <spatial:csgScale scaleX="1.5" scaleY="1.5" scaleZ="1.5">
    <spatial:csgSetOperator spatial:operationType="union"
        spatial:id="union1">
      <spatial:listOfCSGNodes>
        <spatial:csgPrimitive primitiveType="sphere"/>
        <spatial:csgTranslation spatial:id="translation"
            spatial:translateX="1" spatial:translateY="0" spatial:translateZ="0">
          <spatial:csgPrimitive primitiveType="sphere"/>
        </spatial:csgTranslation>
      </spatial:listOfCSGNodes>
    </spatial:csgSetOperator>
  </spatial:csgScale>
</spatial:csgObject>
The resulting shape will look different and be in different positions
depending on how we define things. First, do you scale the entire
two-sphere shape as a whole, or do you scale the individual spheres? If
you scale the entire shape, you will end up with two spheres that again
just touch. If you scale the individual spheres, both will grow,
overlapping each other. The other thing you need to decide is what the
center of the scaling is--if you scale the entire shape centered at the
shape's 'center of mass' (the point where they touch, in this instance),
the two spheres will essentially grow out from that point. If they are
scaled from the origin, the first sphere will simply grow, while the one it
touches will be pushed outward.
In the spec, I wrote that each individual shape should be individually
scaled, which in this case would mean that the spheres would grow and
overlap each other. But that was based entirely on my intuition of what
might be easier to implement, and not on any actual experience. Have any
of you actually done scaling of grouped elements like this? How did you do
it? Did you decide based on what was easy to implement, or based on what a
user would more likely want, or something else?
-Lucian
|
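The three interpretations of the scale in the message above can be compared numerically. This is a minimal sketch, not anything taken from the spec or from libsbml: the `Sphere` class and the helper functions are hypothetical, only the x axis is tracked, and the sphere radius of 0.5 is an assumption inferred from the example, where a translation of 1 makes the two spheres just touch.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Sphere:
    cx: float  # x coordinate of the centre (y and z stay 0 in this example)
    r: float   # radius


def gap(a: Sphere, b: Sphere) -> float:
    """Surface-to-surface distance: 0 means touching, negative means overlap."""
    return abs(b.cx - a.cx) - (a.r + b.r)


def scale_whole(spheres, s, about=0.0):
    """Scale the composed shape as one object about the point x = `about`."""
    return [Sphere(about + s * (sp.cx - about), s * sp.r) for sp in spheres]


def scale_each(spheres, s):
    """Scale every primitive in place: centres stay put, radii grow."""
    return [Sphere(sp.cx, s * sp.r) for sp in spheres]


R = 0.5  # assumed primitive radius, so that translateX=1 gives touching spheres
shape = [Sphere(0.0, R), Sphere(1.0, R)]

about_origin = scale_whole(shape, 1.5)             # first sphere grows in place
about_contact = scale_whole(shape, 1.5, about=R)   # both grow out from x = 0.5
per_primitive = scale_each(shape, 1.5)             # spheres overlap

for label, result in [
    ("whole shape, about origin", about_origin),
    ("whole shape, about contact point", about_contact),
    ("each primitive separately", per_primitive),
]:
    print(label, result, "gap =", gap(*result))
```

Both whole-shape variants keep the spheres touching and differ only in where the pair ends up, while the per-primitive variant leaves the centres at 0 and 1 and produces an overlap of 0.5.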
|
From: Paul M. <pau...@us...> - 2014-10-17 17:14:31
|
Parsing and postprocessing should be a lot easier and faster if the compression is within the XML (so the tags are still uncompressed and easy to parse), rather than enclosing the XML (so you have to decompress the whole thing prior to parsing and postprocessing / analysis). When the files are big and you have a lot of them to process, this becomes significant.
Not that these are any of your 1-3 per se, but you do talk about sticking the whole thing into a zip file. We're shying away from that and looking towards HDF and/or XML + base64 because for 3D and multicell work, the files become pretty big and the wait for the zip/unzip process can be a pretty significant bottleneck to analyzing simulation outputs.
-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Paul Macklin, Ph.D.
Assistant Professor of Research Medicine
Center for Applied Molecular Medicine
Keck School of Medicine
University of Southern California
Los Angeles, CA
Founder and Co-Lead of the MultiCellDS Project
*MultiCellDS*: http://MultiCellDS.org <http://multicellds.org/> / @MultiCellDS <http://www.twitter.com/MultiCellDS>
*email*: Pau...@us... / Pau...@Ma...
*web*: http://MathCancer.org <http://mathcancer.org/>
*Twitter*: @MathCancer <http://www.twitter.com/MathCancer>
*mobile*: +1 310-701-5785
*FAX*: +1 323-442-2764
|
|
From: Lucian S. <luc...@gm...> - 2014-10-17 16:58:10
|
OK, so one of the options can obviously remain 'write the numbers as a string, store that in the XML' for readability.

For compression, we have:

1) binary --> string (ftoa) --> compressed string (this is the existing scheme)
2) binary --> base64
3) binary --> base64 --> compressed string

Andy reports that base64 encoding of binary data is about 30% more efficient than string encoding of binary data (ftoa), and also has the advantage of being faster to process when decoding. Since ftoa results in a smaller character set (0-9, '.', '-', 'e', spaces), you'd recover some of that inefficiency if you compared 1) to 3), but probably not all of it. You'd also still have the slower decoding step.

The disadvantage of 3) relative to 2) is that the resulting .zip file of the entire document would be slightly larger for 3) than for 2), so the question becomes: what is the main purpose of encoding the data in the file this way? If it's 'smaller file size', you'd go with 2), but if it's 'less of the file I have to scroll through when reading it by hand', you'd want 3).

Anyone have strong opinions either way? Is this worth an actual poll of the community?

-Lucian

On Wed, Oct 15, 2014 at 6:43 PM, Paul Macklin <pau...@us...> wrote:
> Interesting!
>
> Perhaps a big improvement to use ieee and base64 for all numerical fields
> and get rid of atof? |
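To make the three schemes in the message above easier to compare, here is a minimal Python sketch that produces each payload for the same array. Several details are assumptions made only for illustration and are not taken from the spec or from libsbml: zlib stands in for whatever compressor is used, compressed bytes are made XML-safe by base64-encoding them, doubles are packed little-endian, and the function name `encode_all` is hypothetical.

```python
import base64
import struct
import zlib


def encode_all(values):
    """Encode the same list of floats with each scheme from the thread.

    Returns a dict of byte strings so that their lengths can be compared.
    """
    # Scheme 1: numbers -> decimal text -> compress.
    # Assumption: the compressed bytes are base64-encoded to be XML-safe.
    text = " ".join(repr(v) for v in values).encode("ascii")
    scheme1 = base64.b64encode(zlib.compress(text))

    # Scheme 2: raw IEEE 754 doubles (little-endian, by assumption) -> base64.
    raw = struct.pack(f"<{len(values)}d", *values)
    scheme2 = base64.b64encode(raw)

    # Scheme 3: the base64 text from scheme 2 -> compress.
    # Assumption: base64 is applied again to keep the result XML-safe.
    scheme3 = base64.b64encode(zlib.compress(scheme2))

    return {"text": text, "scheme1": scheme1, "scheme2": scheme2, "scheme3": scheme3}
```

Calling `{k: len(v) for k, v in encode_all(data).items()}` on a representative array gives the sizes to compare. Which scheme comes out smallest depends heavily on the data (smooth fields, repeated values, and required precision all matter), so this is a way to measure on real models rather than a prediction of the result.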
|
From: Paul M. <pau...@us...> - 2014-10-16 01:43:49
|
Interesting!

Perhaps it would be a big improvement to use IEEE 754 and base64 for all numerical fields and get rid of atof?

On Oct 15, 2014 6:29 PM, "Andy Somogyi" <and...@gm...> wrote:
> A big part of the slowness comes from parsing a string to float, i.e. atof.
>
> Plus, atof does not even behave the same on different platforms, and
> different locales throw in another complication.
>
> All modern processors use the IEEE 754 double format, so it's actually a
> much more standard format than textually formatted numbers. |
|
From: Andy S. <and...@gm...> - 2014-10-16 01:28:48
|
A big part of the slowness comes from parsing a string to float, i.e. atof.

Plus, atof does not even behave the same on different platforms, and different locales throw in another complication.

All modern processors use the IEEE 754 double format, so it's actually a much more standard format than textually formatted numbers.

On Wednesday, October 15, 2014, Paul Macklin <pau...@us...> wrote:
> Thanks, Andy.
>
> Out of curiosity, is that slowness from parsing complexity or from the
> disk read/write itself? Is it still the same bottleneck if reading/writing
> files on a solid state disk or ram disk? |
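As a rough illustration of the point about text versus binary representations, the Python sketch below contrasts a bit-exact round trip through packed IEEE 754 doubles with a round trip through decimal text. The helper names are hypothetical, and the little-endian byte order is an assumption that a real format would have to state explicitly. Python's `float()` is not locale-sensitive, so the locale problem mentioned above (C's `atof`/`strtod` honouring the locale's decimal separator) is not reproduced here; only the precision side is shown.

```python
import base64
import struct


def doubles_to_base64(values):
    """Pack values as little-endian IEEE 754 doubles and base64-encode them."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return base64.b64encode(raw).decode("ascii")


def base64_to_doubles(blob):
    """Inverse of doubles_to_base64; no decimal parsing is involved."""
    raw = base64.b64decode(blob)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


def text_round_trip(values, digits=6):
    """Format each value with `digits` significant digits, then parse it back."""
    text = " ".join(f"{v:.{digits}g}" for v in values)
    return [float(token) for token in text.split()]
```

The binary path returns exactly the doubles that went in. The text path does so only when enough digits are written (17 significant digits are sufficient for a double), which is part of why text-encoded arrays tend to be both larger and slower to read.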
|
From: Paul M. <pau...@us...> - 2014-10-16 01:14:02
|
Thanks, Andy.

Out of curiosity, is that slowness from parsing complexity or from the disk read/write itself? Is it still the same bottleneck if reading/writing files on a solid state disk or ram disk?

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Paul Macklin, Ph.D.

Assistant Professor of Research Medicine
Center for Applied Molecular Medicine
Keck School of Medicine
University of Southern California
Los Angeles, CA

Founder and Co-Lead of the MultiCellDS Project
*MultiCellDS*: http://MultiCellDS.org <http://multicellds.org/> / @MultiCellDS <http://www.twitter.com/MultiCellDS>

*email*: Pau...@us... / Pau...@Ma...
*web*: http://MathCancer.org <http://mathcancer.org/>
*Twitter*: @MathCancer <http://www.twitter.com/MathCancer>

*mobile*: +1 310-701-5785
*FAX*: +1 323-442-2764

On Wed, Oct 15, 2014 at 6:08 PM, Andy Somogyi <and...@gm...> wrote:
> Just store the binary array as a base64 encoded blob.
>
> Not only will the file size be about 30% the size of converting to
> strings, but it is an order of magnitude faster in terms of parsing and
> reading the data.
>
> In profiling our simulations, currently the slowest part is reading the
> sbml, so anything that would improve performance in this area would be very
> useful. |
|
From: Andy S. <and...@gm...> - 2014-10-16 01:09:03
|
Just store the binary array as a base64-encoded blob. Not only will the file be about 30% of the size you get from converting the numbers to strings, it is also an order of magnitude faster to parse and read. In profiling our simulations, the slowest part is currently reading the SBML, so anything that improves performance in this area would be very useful.

On Wednesday, October 15, 2014, Paul Macklin <pau...@us...> wrote:
> It sounds like #1 converts the numbers to strings in a sprintf-like
> fashion, and then compresses this string (to another string).
>
> It sounds like #2 would directly compress the numbers (in their native
> binary format), then encode the compressed output as text (e.g., via base64)
>
> I was wondering what you thought of a (#1/#2)': encode the
> doubles/floats/whatever to text via base64 first, compress this, then store
> the resulting text in the data field.
>
> Thanks -- Paul |
|
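As an illustration of the base64 blob suggested in the message above, here is a minimal Python sketch. The helper names `encode_doubles` and `decode_doubles` are hypothetical, and the little-endian IEEE 754 layout is an assumption: the thread does not settle byte order or element type, and none of this is libsbml API.

```python
import base64
import struct


def encode_doubles(values):
    """Pack floats as little-endian 8-byte doubles, then base64-encode.

    The byte order is an assumption made for this sketch.
    """
    raw = struct.pack(f"<{len(values)}d", *values)
    return base64.b64encode(raw).decode("ascii")


def decode_doubles(blob):
    """Inverse of encode_doubles: base64-decode, then unpack the doubles."""
    raw = base64.b64decode(blob)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


samples = [0.1, 2.5, -3.75, 1e-9]
blob = encode_doubles(samples)      # ASCII text, safe to place in an XML field
restored = decode_doubles(blob)     # exact round trip, no string parsing
```

Because the values never pass through a decimal representation, the round trip is bit-exact and the reader does no number parsing, which is the source of the speedup claimed above. A real format would also have to record the element type and byte order somewhere.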
From: Paul M. <pau...@us...> - 2014-10-16 00:59:15
|
It sounds like #1 converts the numbers to strings in a sprintf-like fashion, and then compresses this string (to another string).

It sounds like #2 would directly compress the numbers (in their native binary format), then encode the compressed output as text (e.g., via base64).

I was wondering what you thought of a (#1/#2)': encode the doubles/floats/whatever to text via base64 first, compress this, then store the resulting text in the data field.

Thanks -- Paul

-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-=-
Paul Macklin, Ph.D.
Assistant Professor of Research Medicine
Center for Applied Molecular Medicine
Keck School of Medicine
University of Southern California
Los Angeles, CA

Founder and Co-Lead of the MultiCellDS Project
MultiCellDS: http://MultiCellDS.org / @MultiCellDS
web: http://MathCancer.org
Twitter: @MathCancer |
|
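The (#1/#2)' variant proposed above (base64 first, then compress) could look roughly like the following Python sketch. The function names are hypothetical and zlib is an assumed compressor, since the thread does not name one. One point the sketch makes visible: the compressor's output is binary again, so a further text encoding is still needed before the result can sit in an XML field.

```python
import base64
import struct
import zlib


def encode_base64_then_compress(values):
    """Sketch of the (#1/#2)' ordering: base64 first, compress second."""
    raw = struct.pack(f"<{len(values)}d", *values)   # assumed little-endian doubles
    text = base64.b64encode(raw)                     # step 1: binary -> base64 text
    packed = zlib.compress(text)                     # step 2: compress; output is bytes
    # Step 3: the compressed bytes are not XML-safe, so encode them as text again.
    return base64.b64encode(packed).decode("ascii")


def decode_base64_then_compress(field):
    """Undo the three steps above in reverse order."""
    text = zlib.decompress(base64.b64decode(field))
    raw = base64.b64decode(text)
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


grid = [float(i % 7) for i in range(1000)]
field = encode_base64_then_compress(grid)
roundtrip = decode_base64_then_compress(field)
```

Whether this ordering compresses better or worse than compressing the raw bytes directly depends on the data and the compressor, so it would need to be measured on real geometry arrays rather than assumed.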
From: Lucian S. <luc...@gm...> - 2014-10-15 23:46:45
|
Another issue that hasn't been resolved yet: do people want to be able to create CSGPrimitive surfaces as well as solids? Presumably, they would be used to create 2D compartments within 3D space (such as cell membranes), and/or 1D compartments within 2D space.

If so, I would propose a new attribute on CSGPrimitive that indicates whether the shape is to be a "solid" or a "surface". For the 2D primitives, will 'solid' and 'surface' suffice, or would we need other terms, like 'filled' and 'border'?

-Lucian |
|
From: Lucian S. <luc...@gm...> - 2014-10-15 23:23:52
|
OK, let me see if I can summarize the issues about compression, and ask people's opinions moving forward:

As things stand right now, the spec itself is a little vague on how compression works. This obviously needs to be updated, but we should make sure we know what we want, first.

The libsbml implementation of compression (the one used by Frank and Jim) works by compressing a *string* of numbers into a format that can be written into an XML file safely (I still don't know which one, but let's assume that this, at least, doesn't need to be changed). This is why Frank is concerned about the delimiter or lack thereof: all spaces, delimiters, etc. are getting compressed along with everything else.

The big advantage of this system is that it's implemented.

The disadvantage of this system is that it's fairly inefficient, mostly because encoding a number as a string is inefficient to start with.

So that's option #1: keep things as they are implemented now, with possible tweaks for delimiters, etc.

For option #2, we could compress the arrays of numbers directly, and encode that compression in the same way in the XML. This would have the advantage of being more compressed, but has the disadvantage of not being implemented yet.

For option #3, we could ditch compression of the arrays entirely, and rely on our ability to compress the entire SBML document instead (libsbml has built-in features that let it read and write compressed documents). Compressing the whole document with the numbers written out would actually result in smaller files than compressing those number strings first, a la option #1. The disadvantage of this system is that the uncompressed files are really big, and therefore the parts that *aren't* huge arrays of numbers are harder to read and debug.

As far as delimiters go, it seemed to me that the simplest option would be to allow a ';' delimiter wherever people wanted it, and to remove it for compression. The order of numbers and their meaning would be precisely defined in the spec, so that special delimiters (besides the space between the numbers themselves) were not strictly needed, but could be provided for readability.

Also, keep in mind that if the size of the file itself is an issue, the entire file can be compressed, not just these strings of numbers. The point of compressing the numbers inside the XML file is (I believe) so that the *rest* of the file is easier to view manually.

-Lucian |
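To make options #1 and #2 and the ';' delimiter rule concrete, here is a rough Python sketch. Everything in it is an assumption for illustration: the function names are hypothetical, zlib plus base64 stands in for the unspecified "format that can be written into an XML file safely", and little-endian doubles stand in for the undecided binary layout. It does not reproduce the actual libsbml implementation.

```python
import base64
import struct
import zlib


def strip_delimiters(text):
    """Drop optional ';' delimiters and collapse whitespace before compressing."""
    return " ".join(text.replace(";", " ").split())


def option1_encode(text):
    """Option #1: compress the *string* of numbers, then make it XML-safe."""
    cleaned = strip_delimiters(text).encode("ascii")
    return base64.b64encode(zlib.compress(cleaned)).decode("ascii")


def option1_decode(field):
    """Recover the numbers by decompressing and parsing the string."""
    text = zlib.decompress(base64.b64decode(field)).decode("ascii")
    return [float(token) for token in text.split()]


def option2_encode(values):
    """Option #2: compress the numbers in binary form, then make it XML-safe."""
    raw = struct.pack(f"<{len(values)}d", *values)
    return base64.b64encode(zlib.compress(raw)).decode("ascii")


def option2_decode(field):
    """Recover the numbers by decompressing and unpacking the raw doubles."""
    raw = zlib.decompress(base64.b64decode(field))
    return list(struct.unpack(f"<{len(raw) // 8}d", raw))


readable = "1 2 3; 4 5 6;"                  # ';' used only for readability
numbers = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
from_text = option1_decode(option1_encode(readable))
from_binary = option2_decode(option2_encode(numbers))
```

Both paths recover the same six values here. Option #3 would skip these functions altogether and compress the whole document at the file level. Which option yields the smallest output is an empirical question for real data, so the sketch makes no size claims.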