<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote">On Mon, Jun 9, 2014 at 12:04 PM, Janosch Machowinski <span dir="ltr"><<a href="mailto:Janosch.Machowinski@dfki.de" target="_blank">Janosch.Machowinski@dfki.de</a>></span> wrote:<br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">On 08.06.2014 at 23:16, Sylvain Joyeux wrote:<div class=""><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Oh ... I would also need the size of each stream's sample (including the size of vectors if there are any) ...<br>
<br>
About index loading: the way the index was marshalled needed to be changed (but was not) after the change you made to indexing (i.e. making indexes dense). A 3-line patch improves performance quite a lot already. Alignment is already pretty good on my test file (~4s).<br>
</blockquote></div>
It gets worse as the number of streams grows. Try a test case with ~60 streams: that is where performance really<br>
drops, and that is the 'real-world' test case...</blockquote><div>I created a one-minute dataset with 100 streams, each at 100 Hz, so 600k samples in total. Generating the index took 4.6 seconds, and loading the file index took 0.8 seconds (from a warm cache, so probably with little I/O overhead).</div>
<div><br></div><div>C++ *is* faster. Of course it is. From what I see, though, it is not fast enough to justify the refactoring you are proposing.</div><div><br></div><div>It would be a lot more interesting to find out why using Vizkit and log control hurts performance so much, and how we could optimize the typelib parts (which are already C++!).</div>
<div><br></div><div>Again, you are *not* providing the right measurements. Speed factors and durations are meaningless if we don't know how many samples each stream contains and how long each stream lasts. "It is 24x faster" on its own means nothing.</div>
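<div><br></div><div>To make the point concrete, here is a minimal sketch (not pocolog code; the stream names and numbers are made up for illustration) of the kind of per-stream report that would make two benchmark runs comparable: raw sample counts and durations, normalized to samples per second.</div>

```ruby
# Hypothetical benchmark report for illustration only: a bare speedup
# factor cannot be interpreted, but samples/second per stream can be
# compared across datasets and implementations.
StreamStats = Struct.new(:name, :samples, :duration_s) do
  # Normalized throughput in samples per second
  def throughput
    samples / duration_s
  end
end

# Made-up measurements; the point is the reporting format, not the values.
streams = [
  StreamStats.new("laser_scans", 600_000, 4.6),
  StreamStats.new("imu_samples", 60_000, 0.3)
]

streams.each do |s|
  puts format("%s: %d samples in %.1fs -> %.0f samples/s",
              s.name, s.samples, s.duration_s, s.throughput)
end
```

<div>With numbers in this form, saying one implementation is "24x faster" becomes verifiable: both throughputs are on the same scale, regardless of how many streams or samples each test file happened to contain.</div>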
<div> </div><div>Sylvain</div></div></div></div>