<div dir="ltr"><div class="gmail_extra"><div class="gmail_quote"><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex"><div class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
Created a dataset of one minute with 100 streams. Each stream is at 100 Hz, so that's 600k samples. It took 4.6 seconds to generate the index and 0.8 seconds to load the file index (from warm cache, so probably with little I/O overhead).<br>
</blockquote></div>
How long did the stream alignment take? This is usually where the problem is, as you can't get better than<br>
O((log n)*s) there, where n is the number of streams and s the number of samples.</blockquote><div>??? What are you talking about? This is only the asymptotic curve. The alignment takes 4.6 seconds.</div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class=""><br>
<blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<br>
C++ *is* faster. Of course it is. From what I see, not fast enough to justify the refactoring that you are proposing.<br>
</blockquote></div>
Oh yes, it does. Recently I did a lot of localization debugging. You can't skip data in this case (nor in a lot of other use cases), which means you have to replay the whole log stream. If the replay is twice as fast, you need half<br>
the time for debugging. So in my eyes it is 100% worth the effort.</blockquote><div>Except that making the part that currently takes 10% of the replay time twice as fast will only make the overall process 5% faster. Even making it 100 times faster will only save just under 10%. From what I see, this is what you are attempting, since what takes the most time is I/O and typelib demarshalling. </div>
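<div>To make that arithmetic concrete, here is a minimal sketch of the Amdahl's-law calculation. The 10% share is the rough estimate from this thread, not a profiler measurement, and the function names are made up for illustration.</div>

```python
# Amdahl's-law arithmetic; the function names and the 10% share are
# illustrative assumptions, not measurements.

def time_saved(fraction, speedup):
    """Share of total run time saved when `fraction` of it runs `speedup` times faster."""
    return fraction * (1.0 - 1.0 / speedup)


def overall_speedup(fraction, speedup):
    """Resulting speedup of the whole run (Amdahl's law)."""
    return 1.0 / ((1.0 - fraction) + fraction / speedup)


if __name__ == "__main__":
    share = 0.10  # assumed share of replay time spent in the optimized part
    for factor in (2, 100):
        saved = time_saved(share, factor)
        total = overall_speedup(share, factor)
        print(f"{factor}x on {share:.0%} of the run: saves {saved:.1%}, overall {total:.2f}x")
```

<div>With a profiled share in place of the assumed 0.10, the same two functions tell you how much a given optimization can buy at best.</div>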
<div><br></div><div>In other words: you are attempting to optimize something without having done any profiling. This is a cardinal sin.</div><div> </div><div><br></div><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
<div class=""><blockquote class="gmail_quote" style="margin:0 0 0 .8ex;border-left:1px #ccc solid;padding-left:1ex">
Again, you are *not* giving the right measurements. Speed factors and durations are meaningless if we don't know how many samples each stream has, and how long each stream lasts. Just "it is 24x faster" means nothing.<br>
</blockquote></div>
You have the C++ implementation; just run multiIndexTester on your test data and compare the results.<br></blockquote><div><br></div><div>Sylvain </div></div></div></div>