Wednesday, 10 January 2018

intel bugs, surveys, random shit

Well i've been following the whole "spectre" and "meltdown" saga over the last week or so. Apart from the pretty offensive development of branding (ugh) for a set of mostly unrelated bugs, loads of pro-intel FUD, misdirection and so on, it's been a bit of a fun ride of stupidity.

I haven't seen any patches for my laptop or work computers so far - in one case because I don't run any AV software on microsoft? Ugh, what a fucking mess. I barely use that machine any more anyway, even less so while running that junk.

What's most appalling is that the problem has been sat on for 6 months under "responsible disclosure" while intel continued selling gimped hardware based on bogus benchmarks. And yet the linux and *bsd developers have either only just started on fixes or only very recently started cleaning them up - even intel is still fucking around with microcode updates.

The tech press, apart from The Register (who really broke this story), has been pretty shithouse too. "big nothingburger" for a massive massive security hole that isn't actually fixed by any of these OS kernel patches, merely mitigated. It's a big big serious problem, with big big serious costs, and the handling of the disclosure has been an utter disaster - seemingly designed to spread FUD about the impacts and costs. Still, if you relied on a single supplier - and particularly a nefarious piece of shit company like Intel - for any serious hardware investment, sucked in I guess? We all have to pay for your stupidity in the end regardless.


I was insomniac[sic] late at night and bored so I took the latest stack overflow survey. Apart from some pretty strange questions it's pretty clear why nobody reads my blog - no mention of almost any of the technologies I use day to day. Well at least they had the brains to identify C and C++ as two separate languages I suppose.

Some of the "order these in importance to you" questions I just skipped - I didn't care about the topic at all. A minor problem was a bit of confusion with "framework" "language", etc - where would OpenCL fit if they even included it? No CUDA either for that matter. Plenty of questions about advertising. No I don't use an ad-blocker but I turn off javascript which makes the site oh so much faster and kills most adverts as a bonus.

Sleep has been pretty miserable lately - probably a side-effect of overdoing it a bit, but I think I've gained enough weight again for the sleep apnoea to kick in. One example of overdoing it was the last weekend. Hit the beach Saturday afternoon, dropped in at a mate's place - drank a carton of (bloody expensive) piss, went to a party (total strangers apart from him), was up all night whilst everyone else was falling asleep, back to his place to pick my bike up - for a couple more, another dip, and finally dropped by the pub on the way home for a fast (very slow!) one. Barmaid thought I was managing well all things considered but maybe that's because I always look exhausted and feel like shit? Nothing like a 36 hour day to get you tired; and then I was awake after only 4 hours of bloody sleep. For a couple of silly reasons I also got sunburnt to a crisp, oops.

Finally kicking into summer mode!

Tuesday, 2 January 2018

Another year down

Well that was 2017 I guess. Better than 2016 for me at least.

Pretty much recovered from NYE and NYEE drinking but I might take it easy for a couple of days. Waiting for the weather to heat up enough to get back to the beach, did some gardening. The replacement BIOS arrived today so I resurrected my PC too - still seems a bit funny so i'll probably get another one soonish but it works again for now.

I found a laser printer dumped on the side of the road so took it apart for something to do. Surprising amount of screws, custom springs, cogs, and lots and lots of tough plastic. Probably nothing I can do with it but the imaging unit is kinda cool - rotating mirror, a couple of lenses and mirrors. I suppose the more surprising thing is how much technology is placed in what is essentially a throwaway, one-use device. The "consumable" isn't the toner in these things, it's the whole machine; and what's worse is the realisation that the economies of scale mean it couldn't be done any other way. The world is so fucked.

When I wrote the following on the new jjmpeg home page:

Version 3+ is a complete rewrite from earlier versions which have effectively disappeared from the internet after google code closed down.

I didn't realise just how true it was. Without google code and regular blog updates my projects have basically vanished from the internets - and more so from google search than others. I'm not particularly surprised there is no real interest in the projects themselves but effectively vanishing is a bit weird. Unless you're using some proprietary publishing platform you essentially don't exist.

Well i've got a few weeks off at least anyway; i'll take a break from the computer, watch some cricket, and basically just bum around a bit.

Monday, 25 December 2017

jjmpeg 3.0 released

Put enough together to push out a release of jjmpeg.

It ended up 1700 lines of Java, 2000 lines of C, and 300 lines of Perl.

Apart from supporting the latest version of FFmpeg (at least when I started a couple of weeks ago), it's smaller, cleaner, and more complete than any previous version. Having said that this is essentially just a beta release.

This one is licensed GNU General Public License Version 3 (or later).

I've kinda had enough for the moment so it's a pretty bare home page, but it's there.

Merry XMAS!

Friday, 22 December 2017

damned enums

Been a long week but i'm finally done with work for another year. Although it's mostly a long week because of the late nights working on jjmpeg ...

One of the things I did was fill out/sync up the important enums - AVCodecID, AVPixelFormat, AVSampleFormat, and so on. Previously the pixel format and sample formats were also Java enums - which can be convenient at times and provides some more (albeit much much overvalued) 'type safety'.

This was fairly easy because the PixelFormat was a simple densely ordered C enum so I could map between the two with a simple +-1. Unfortunately someone decided to add a big hole in the middle of it sometime between 0.10 and 3.4, ... so I gave up and just converted it to a class holding static final ints, and to make it consistent I did that with the other enumerations as well. It doesn't really make the classes any harder to use and improves the class size and memory footprint. I just added some methods to access libav*'s metadata information so I can still map between string representations and so on.
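As a rough illustration of the "class holding static final ints" idea, here is a minimal sketch. The class name, the constant names, the after-the-hole value and the pure-Java name table are all hypothetical; the real values are generated from the ffmpeg headers and the real name lookups go through libav*'s metadata.

```java
// Hypothetical sketch: plain int constants in place of a Java enum.
final class PixelFormatSketch {
    // Illustrative values only; the real ones come from the generated source.
    static final int PIX_FMT_NONE = -1;
    static final int PIX_FMT_YUV420P = 0;
    static final int PIX_FMT_RGB24 = 2;
    // A gap in the numbering is harmless when the values are just ints.
    static final int PIX_FMT_AFTER_HOLE = 100;

    // Stand-in for the native metadata lookup (assumption: a simple table).
    private static final int[] VALUES = { PIX_FMT_YUV420P, PIX_FMT_RGB24, PIX_FMT_AFTER_HOLE };
    private static final String[] NAMES = { "yuv420p", "rgb24", "after_hole" };

    private PixelFormatSketch() {
    }

    // Map a constant to its string name, or null if unknown.
    static String nameOf(int fmt) {
        for (int i = 0; i < VALUES.length; i++)
            if (VALUES[i] == fmt)
                return NAMES[i];
        return null;
    }

    // Map a string name back to its constant, or PIX_FMT_NONE if unknown.
    static int valueOf(String name) {
        for (int i = 0; i < NAMES.length; i++)
            if (NAMES[i].equals(name))
                return VALUES[i];
        return PIX_FMT_NONE;
    }
}
```

No ordinal arithmetic is involved, so a hole in the C enum no longer matters; the cost is that the compiler won't stop you passing any old int.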

I had to add a small compilation stage which extracts the enums from the header files and converts them to a C file which when compiled and run produces the Java source ... this seemed the absolute shortest path to ensuring I got accurate numbers based on the ffmpeg build configuration.

So after about a week's worth of solid work it's grown somewhat (about 2KLOC Java, and 2KLOC C, counting lines with ";{}") and the TODO list is getting pretty short.

I would like to clean up the exception design a bit - unfortunately i'm just not very good at that (who is?) but i'd like to get better. The build system is clean and simple but could be improved and needs to include the aforementioned enum stuff, a dist target and versioning. Logging would be nice (both redirect ffmpeg to java.util.logging and some for jjmpeg itself). JJMediaWriter? Fix the license headers, add at least a README.

Not today though, today I drink.

Tuesday, 19 December 2017

jjmpeg, jni, javafx

So I guess the mood took me; I somehow ended up poking away until the very late morning hours (4am) the last couple of nights hacking on jjmpeg. Just one more small problem to solve ... that never ended. Today I should've been working but I've given up and will write it off; it's nearly xmas break so there's no rush, and I'm ahead of the curve anyway.


I got this ported over and playing video fairly easily, and then went through on a cleanup spree. I removed all the BufferedImage, multi-buffering, and scaling stuff and a few other experiments which never worked. Some api changes allowed me to consolidate more code into a base class, and some changes to AVStream necessitated a different approach to initialising the AVCodecContext (using AVCodecParameters). I made a few other little tweaks on the way.

The reason I removed the BufferedImage code is because I didn't want to pollute it with "platform specific" code. i.e. swing, javafx, etc. I've moved that functionality into a separate namespace (module?).

My first cut just took the BufferedImage code and put it into another class which provides the functionality by taking the current AVFrame from the JJMediaReader video stream. This'll probably do but when working on similar functionality for JavaFX I took a completely different approach - implementing a native PixelReader() so that the native code can decide the best way to write to the buffer. This is perhaps a little more work but is a lot cleaner to use.


jjmpeg1 lets you scale images 'directly' to/from primitive arrays or direct ByteBuffers in addition to AVFrame. Since they have no structure description (size, format), this either has to be passed in to the functions (messy) or stored in the object (also messy). jjmpeg1 used the latter option and for now I simply haven't implemented them.

The PixelReader mentioned above does implement it internally but for code re-use it might make sense to implement them with the structure information as explicit parameters, and use higher level objects such as PixelReader/Writer to track such information. On the other hand the native code has access to more information so it also makes sense to leave it there.

I went a bit further and created a re-usable super-class that does most of the work and toolkit specific routines only have to tweak the invocation. This approach hides libswscale behind another api. The slice conversions don't work properly but they're not necessary.


So far I had public constructors and `finalisers' because otherwise the reflection code failed. That's a bit too ugly (and `dangerous') so I made them private. The reflection code just had to look up the methods and set them Accessible.

    Constructor cc = jtype.getDeclaredConstructor(Long.TYPE);

    // the constructor is private now, so it has to be made accessible first
    cc.setAccessible(true);

    return cc.newInstance(p);

Whilst working on JJMediaReader I hit a snag with the issue of ownership. In most cases objects are either created anew and released (or gc'd) by the Java code, or are simply references to data managed elsewhere. I was addressing the latter problem by simply having an empty release() method for the instance, but that isn't flexible enough because some objects are either created or referenced, and the context determines which.

So I expanded the Java-side object tracking to include a `refer' method in addition to the `resolve' method. `resolve' either creates a new instance or returns an existing one, with a weak-reference object which will invoke the static release method when it gets finalised. `refer' on the other hand does the same thing but uses a different weak-reference object which does nothing.

I then noticed (the rather obvious) that if an object is created, it can't possibly 'go away' from the object tracking if it is still alive; therefore the `resolve' method was doing redundant work. So I created another `create' method which assumes the object is always a new one and simply adds it to the table. It can also do some checking but i'm pretty sure it can't fail ...

If on the other hand the underlying data was reference counted then the `resolve' method would be useful since it would be possible to lookup an existing object despite it being `released'. So i'll keep it in CObject.
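The three entry points described above might fit together something like this. It is only a sketch under assumptions: Handle stands in for CObject, the `released' counter stands in for calling the static native release method, and all the names here are hypothetical rather than jjmpeg's own.

```java
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for CObject: it only stores the native pointer.
class Handle {
    final long p;

    Handle(long p) {
        this.p = p;
    }
}

final class TrackerSketch {
    // Weak reference which remembers the pointer and whether Java owns the data.
    static final class Ref extends WeakReference<Handle> {
        final long p;
        final boolean owned;

        Ref(Handle h, ReferenceQueue<Handle> q, boolean owned) {
            super(h, q);
            this.p = h.p;
            this.owned = owned;
        }
    }

    private final Map<Long, Ref> table = new HashMap<>();
    private final ReferenceQueue<Handle> queue = new ReferenceQueue<>();
    int released; // counts simulated native frees

    // Return the live object for p if there is one, otherwise track a new one.
    private synchronized Handle lookup(long p, boolean owned) {
        Ref r = table.get(p);
        Handle h = (r == null) ? null : r.get();
        if (h == null) {
            h = new Handle(p);
            table.put(p, new Ref(h, queue, owned));
        }
        return h;
    }

    // `resolve': new or existing instance, released when it is collected.
    Handle resolve(long p) {
        return lookup(p, true);
    }

    // `refer': same lookup, but collection does nothing to the native data.
    Handle refer(long p) {
        return lookup(p, false);
    }

    // `create': the pointer is known to be new, so skip the lookup.
    synchronized Handle create(long p) {
        Handle h = new Handle(p);
        table.put(p, new Ref(h, queue, true));
        return h;
    }

    // Process references the collector has enqueued (e.g. from a cleanup thread).
    synchronized void drain() {
        Ref r;
        while ((r = (Ref) queue.poll()) != null) {
            table.remove(r.p, r); // only removes if r is still the current entry
            if (r.owned)
                released++;
        }
    }
}
```

The only difference between `resolve' and `refer' here is the flag carried by the weak reference, which is what decides whether anything happens once the Java object goes away.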

As part of this change I also improved CObject in other ways.

I was storing the weak reference to the object itself inside the object so I could implement explicit release and to avoid copying the pointer. I removed that reference and only store the pointer now. The WeakReference is already tracked in a hash table so I just look it up if I need it. This lets me change the jni code to use a field lookup rather than a function call to retrieve it (I doubt it makes much perf difference but I will profile it at some point).

I also had some pretty messy "cross-layer" use of static variables and messy synchronisation code. I moved all map references to outside of the weak reference routine and use a synchronised map for the pointer to object table.

For explicit release I simply call .clear() and .enqueue() on the WeakReference - which seems to do the right thing, and simplifies the release code (at least conceptually) since it always runs on the same thread.
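A small sketch of the explicit release trick, assuming a single shared ReferenceQueue that a cleanup thread would normally drain. ReleaseSketch, Ref and pollReleased are hypothetical names, and the pointer is just a long with no native code behind it.

```java
import java.lang.ref.Reference;
import java.lang.ref.ReferenceQueue;
import java.lang.ref.WeakReference;

final class ReleaseSketch {
    static final ReferenceQueue<Object> QUEUE = new ReferenceQueue<>();

    // Weak reference which carries the native pointer to free.
    static final class Ref extends WeakReference<Object> {
        final long p;

        Ref(Object o, long p) {
            super(o, QUEUE);
            this.p = p;
        }
    }

    // Explicit release: drop the referent and hand the reference to the
    // queue ourselves instead of waiting for the garbage collector.
    static void release(Ref r) {
        r.clear();
        r.enqueue();
    }

    // What the cleanup side would do: take the next reference off the queue.
    // Returns the pointer that would be freed, or 0 if the queue is empty.
    static long pollReleased() {
        Reference<?> r = QUEUE.poll();
        return (r instanceof Ref) ? ((Ref) r).p : 0L;
    }
}
```

Explicit and gc-driven releases then take the same path through the queue, so the actual free only has to be written once.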

Monday, 18 December 2017

`parallel' streams

I had a task which I thought naturally fitted the Java streams stuff so tried it out. Turns out it isn't so hot for this case.

The task is to load a set of data from files, process the data, and collate the results. It's quite cpu intensive so is a good fit for parallelisation on modern cpus. Queuing theory would suggest the most efficient processing pipeline would be to run each processing task on its own thread rather than trying to break the tasks up internally.

I tried a couple of different approaches:

  • Files.find().forEach() (serial to compare)
  • Files.find().parallel().collect(custom concurrent collector)
  • Files.find().parallel().flatMap().collect(toList())

The result was a bit pants. At best they utilised 2 whole cores and the total execution times were 1.0x, 0.77x, and 0.76x respectively of the serial case. The machine is some intel laptop with 4 HT cores (i.e. 8x threads).

I thought maybe it just wasn't throwing enough threads at it and stalling on the i/o, so I tried a separate flatMap() stage to just load the data.

  • Files.find().parallel().flatMap(load).flatMap(process).collect(toList())

But that made no difference and basically ran the same as the custom collector implementation.

So I hand-rolled a trivial multi-thread processing graph:

  • I/O x 1: Files.find().forEach(load | queue)
  • Processing x 9: queue | process | outqueue
  • Collator x 1: outqueue | List.add()

With a few sentinel messages to handle finishing off and cleanup.

Result was all 8x "cores" fully utilised and a running time 0.30x of the serial case.
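The hand-rolled graph can be sketched roughly as below. This is not the code I timed: PipelineSketch, run and the EOF sentinel are hypothetical names, an in-memory list stands in for Files.find() plus loading, and it assumes at least one worker and that process never returns null.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.BlockingQueue;
import java.util.concurrent.LinkedBlockingQueue;
import java.util.function.Function;

final class PipelineSketch {
    // Sentinel message which marks the end of a queue.
    private static final Object EOF = new Object();

    @SuppressWarnings("unchecked")
    static <I, O> List<O> run(List<I> inputs, Function<I, O> process, int workers) {
        BlockingQueue<Object> queue = new LinkedBlockingQueue<>();
        BlockingQueue<Object> outqueue = new LinkedBlockingQueue<>();
        List<Thread> threads = new ArrayList<>();

        // I/O x 1: feed the work queue, then send one sentinel per worker.
        threads.add(new Thread(() -> {
            try {
                for (I in : inputs)
                    queue.add(in);
            } finally {
                for (int i = 0; i < workers; i++)
                    queue.add(EOF);
            }
        }));

        // Processing x workers: queue | process | outqueue
        for (int i = 0; i < workers; i++) {
            threads.add(new Thread(() -> {
                try {
                    for (Object o = queue.take(); o != EOF; o = queue.take())
                        outqueue.add(process.apply((I) o));
                } catch (InterruptedException e) {
                    Thread.currentThread().interrupt();
                } finally {
                    outqueue.add(EOF); // tell the collator this worker is done
                }
            }));
        }
        threads.forEach(Thread::start);

        // Collator x 1 (here the calling thread): outqueue | List.add()
        List<O> results = new ArrayList<>();
        try {
            for (int finished = 0; finished < workers;) {
                Object o = outqueue.take();
                if (o == EOF)
                    finished++;
                else
                    results.add((O) o);
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return results;
    }
}
```

Results arrive in whatever order the workers finish, so anything order-sensitive would need to carry an index along with each item.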

I didn't record the numbers but I also had a different implementation that parallelised parts of the numerical calculation instead. Also using streams via IntStream.range().parallel() (depending on the problem size). Surprisingly this had much better CPU utilisation (5x cores?) and improved runtime. It's surprising because that is a much finer-grained concurrency with higher overheads and not applied to the full calculation.
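For comparison, the finer-grained approach looks something like this sketch. The squaring is a placeholder for the real numerical work, and the class name and the threshold value are made up for the example.

```java
import java.util.stream.IntStream;

final class ParallelRangeSketch {
    // Hypothetical cut-off below which going parallel isn't worth the overhead.
    static final int PARALLEL_THRESHOLD = 1 << 12;

    static double[] squares(double[] in) {
        double[] out = new double[in.length];
        IntStream range = IntStream.range(0, in.length);
        // Only go parallel when the problem is big enough.
        if (in.length >= PARALLEL_THRESHOLD)
            range = range.parallel();
        // Each index is written by exactly one task, so no locking is needed.
        range.forEach(i -> out[i] = in[i] * in[i]);
        return out;
    }
}
```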

I've delved into the stream implementation a bit trying to understand how to implement my own Spliterators and whatnot, and it's an extraordinarily large amount of code for these rather middling results.

Not that it isn't a difficult problem to solve in a general way; the stream "executor" doesn't know that my tasks and i/o are slow and high-latency compared to the many small cpu-bound tasks which it seems to be tuned for.

Still a bit disappointing.

Sunday, 17 December 2017

jjmpeg & stuff

Well for whatever reason I got stuck into redoing jjmpeg and seem to have written most of the code (90%?) after a couple of weekends. It was mostly mandraulic and a bit tedious but somehow surprisingly relaxing and engaging; a short stint of unchallenging work can be a nice change. A couple of features are still missing but the main core is done.

Unfortunately my hope that the ffmpeg api was more bindable didn't really pan out but it isn't really any worse either. Some of the nastiest stuff doesn't really need to be dealt with fortunately.

I transformed most of the getters and setters into a small number of simple macros, and thus that part is only about as much work as the previous implementation despite not needing a separate compilation stage. I split most of the objects into separate files to make them simpler to maintain and added some table-based initialisation helpers to reduce the source lines and code footprint.

It's pretty small - counting `;' there's only 750 lines of C and 471 lines of Java sources. The 0.x version has 800 lines of C and 900 lines of Java, a big portion of which is generated from an 800 line (rather unmaintainable) Perl script. And the biggest reduction is the compiled size, the jar shrank from 274KB to 73KB, with only a modest increase from 55KB to 71KB in the (stripped) shared library size (although the latter doesn't include the dvb or utility classes).

There's still a lot of work to do though, I still need to test that anything actually works and port over the i/o classes and enum tables at the least, and a few more things probably. This is the boring stuff so it'll depend on my mood.

Fuck PCs

In other news I finally killed my PC - I tried one more time to play with the BIOS and after a few updates it got so unstable it just crashed during an update and bricked the motherboard. Blah. I discovered I could order a new BIOS rom so I've done that and I'll see if I can recover it, otherwise I might get another mobo if I can still get AM2+ boards here, or just get another machine. I'll probably look into the latter anyway as it's always been a bit of a hassle (despite working flawlessly when it does, and it's a very nice small machine).