7 years after github scored its buyout from Microsoft, they're finally starting to drop the hammer.  It will soon be joining skype, goog code, sourceforge, linuxbox.  Lions didn't adopt it until the 2014 time frame.  It was just 1 broken hvirtual placeholder.  Now, it's manely CAD files.  Since git doesn't diff binary files, it's terabytes of near-identical copies of the same files.


---------------------------------------------------------------------------------------------------------------------

An ill-fated attempt to get ffmpeg TOC generation to read asynchronously ran into the reality that ffmpeg uses its own avio library & doesn't read in large blocks. The MKV parser does a lot of avio_skip, avio_r8, avio_rb32, avio_rb64, avio_seek, avio_read.  These call into aviobuf.c, which manages a buffer.  There could be an override of the C library functions, with a lot of effort.

The C library was last overridden in renderfarmfsclient.C.  It would have to selectively create a spooler thread based on the filename.  So there was a chance, if testing showed it was reading very small blocks & they were all sequential.

ffmpeg TOC generation has become a pain as the hard drive has filled, slowed down, & bitrates have gotten higher.

When lions encounter massive investments in useless features from 25 years ago, like the channel editor, the renderfarm, the batch recording, & the batch rendering, they're reminded these were actually big use cases in a long lost set of circumstances.  The renderfarm was essential when video compression was super slow.  Batch recording was essential when the only decent quality video was over the air analog.

Hélas, batch recording is now definitely a pain & prone to causing file overwrites.

Now, ffmpeg TOC generation is a big use case.  Verified it calls read a lot with 32768-131072 byte blocks & lseek in the beginning.  Some rough profiling showed it might get a 15% speed boost for high bitrate content & 20% for low bitrate content.  The trick is finding the file descriptor the TOC builder is using.


Audio indexes were originally built by an asynchronous file reader but not anymore.  It only accelerated raw PCM.

Made a rough spooling implementation & tested with the newest hard drive.  Unmounted the hard drive to clear the cache.

A 5.9 gig 4.4 megabit movie indexed in 1m2s with spooling & 1m4s without spooling.

A 2.9 gig 2 megabit movie indexed in 42 seconds with spooling & 41 seconds without spooling.  With caching, it was 42 & 40.


The profiler accumulated 1000 samples per data point so the read times differed by microseconds.  Time spent in reads was 12ms / 1000 without spooling & 5ms / 1000 with spooling.  The processing time was roughly 500ms / 1000 in each mode.


With the oldest hard drive, it indexed a 2.9 gig file in 47s with the spooler & 47s without the spooler.  Read times were 20-40ms / 1000 without spooling & still in the high 4ms / 1000 range with spooling.

Quite disappointing, but it confirms why it was dropped for audio.  Modern hard drives are so fast, they're just not holding anything back.  Reads would have to be much closer to the processing time.  Young lion never profiled the audio indexing when it was spooled.  It might have done better on the Quantum Bigfoot, reading 5 megabytes/sec while the 166MHz Cyrix 686 crunched away converting int to double.

22 year old lion didn't think this software business would lead anywhere, so he was really laying on the unnecessary features to make something more showy than functional.



------------------------------------------------------------------------------------------------------------------------

Finally supporting a header size with libsndfile was next.  That library is so old & abandoned, it's an unlikely feature.  You create an SF_VIRTUAL_IO struct & offset all the seeks by the header size.  The vio_data pointer points to the class.  It only applies to read mode with PCM files.  Write mode never supported a header size.

--------------------------------------------------------------------------------------------------------------------

Finally we come to titler & text box extents with empty linefeeds before the text.  It seemed to be a regression from UTF support, but it wasn't.  It was discarding the 1st linefeed in the text.  Then the leading linefeeds were getting dropped somewhere in the keyframe storage.

It seems FileXML::read_text drops all the leading newlines in an anonymous block of text.  That was an important feature, so leading newlines just have to be converted to &#xA;.  That was the easiest solution even though it makes the XML illegible.

It still couldn't render the leading & trailing linefeeds.  It draws neither leading nor trailing linefeeds in the text, but leading & trailing linefeeds are something lions intuitively try when Y positioning is too cumbersome.  For now, the big old TitleMain::get_total_extents function kludges a fixed line spacing when it encounters a linefeed.

Somehow, leading & trailing spaces always managed to work.  It depends on the existence of a glyph for the space character.  Characters with no glyphs like linefeeds need kludges.

--------------------------------------------------------------------------------------------------------------------

Oh how prescient Cinelerra's playback speed features were.  Fortunately, 4x doesn't require paying.  The other editors have always supported variable speed with constant pitch.  It took lions 25 years entrenched in audio tape emulation to get there, but the others all require incrementally changing speed with JKL. 

The same thing happened during silent movies, with theaters speeding up playback in order to pack in more showings.

-----------------------------------------------------------------------------------------------------------


Plausible that AI replaces all human readable code with ponderously complex constructions which can only be read by AI.  The growth in online documentation after 2000 already did away with constructs that were easily memorized.

The traditional way of printing large integers was easily memorized typecasts & short formatting strings, from a world without online documentation.  The AI method relies on syntax no-one remembers.

























