Our Software Development Fallacies - Embedded Development Stories #2

When you build on top of a ready-made system, is it natural to become myopic about solutions because of the layers underneath? Over the last few months, I have often thought about the fallacies that held us back while developing an extreme project. Here’s the story of how we came up with our own definitions of “fallacy of structure” and “fallacy of proof”.

Our aim was to integrate an audio over BLE characteristic solution on both the mobile side and the hardware’s embedded side. We were lucky (!) because we were porting the microcontroller company’s sample project to our own project. They had developed a codebase valid for all of their evaluation boards, and its structure seemed easy to manage. And we benefited from it at the beginning.

We ran and validated the provider’s sample project. When we attempted the port, we couldn’t run the feature reliably; the codebase was producing faulty output. We initially suspected stack and heap usage, then wondered about hardware and antenna issues. But in every check we did, we got crackling audio output. The code was working, but not properly.

Errors on Open Sourced Public Libraries

To validate the system, we wrote some Python scripts that could run flexibly in any desktop environment. (ChatGPT was really terrible at writing Opus encode and decode code for high-level languages, and was okay for a C implementation.) The purpose of the scripts was to copy WAV or OPUS files from the mobile app and the embedded side and to encode / decode them.

While writing the scripts, we encountered two Python-related problems.

After upgrading to Python 3.12, we found that the global package installation method had changed, so we quickly forced a global installation with pip’s --break-system-packages parameter. After that, I wondered when we might get a stable Python experience. High flexibility is always welcome, but I feel like a newbie every time I switch to Python. No other programming language makes me feel this way.

The most widely used package, pyogg, didn’t come with the required methods; some files were missing. Thanks to my experience with open source projects, I am used to checking a repository’s versions and open issues. This experience-based guess led to an immediate solution in a GitHub issue. Luckily, contributors had explained the proper installation there.

Would you have guessed that you should install the package with the pip install git+https://github.com/TeamPyOgg/PyOgg command instead of the pip install pyogg command to get the latest main version, the one the documentation points to?

These kinds of issues can happen in open source repositories. Contributors might not find the time to distribute changes properly. But what about the sample projects that a company open-sources? Let’s follow the main story.

When I built the sample iOS project, I noticed that the build size of the app was different from that of the same app published on the App Store. There can be lots of reasons for such a difference, but this was not the pattern with the provider’s other projects: I was getting the same sizes when I built those codebases. As we quickly checked out the different branches, we noticed that the provider did not have a proper distribution process. They kept the release version in branches other than main / master. There were lots of branches, and it took a lot of time to check out and analyze them.

We discovered that the provider had released the app with a buffer size different from the one in the main branch. And their calculation of the buffer size gave half of the required size. When we put in the correct size, the audio was flawless. No heap, stack or hard fault issue!
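To make the size arithmetic concrete, here is a minimal sketch of how the byte size of one PCM audio frame can be computed. It is an illustration only, not the provider’s code: the function name pcm_frame_bytes and the example values (16 kHz, mono, 16-bit samples, 20 ms frames) are assumptions. One way a buffer can end up at half of the needed size is when a sample count is used where a byte count is required.

```python
def pcm_frame_bytes(sample_rate_hz, frame_ms, channels=1, bytes_per_sample=2):
    """Return the number of bytes needed to hold one PCM frame.

    Hypothetical helper for illustration; not taken from the sample project.
    """
    samples_per_channel = sample_rate_hz * frame_ms // 1000
    return samples_per_channel * channels * bytes_per_sample


# Assumed example values: 16 kHz, mono, 16-bit samples, 20 ms frames.
samples_per_frame = 16_000 * 20 // 1000        # number of samples in a frame
required_bytes = pcm_frame_bytes(16_000, 20)   # number of bytes in a frame

# With 16-bit samples, a buffer sized by sample count is half of what is needed.
print(samples_per_frame, required_bytes)
```

Checking this kind of arithmetic on paper or in a few lines of script is cheap compared to chasing stack, heap or antenna issues.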

The case is not closed until we draw the lessons from it. Here’s an overview of the process, so that we don’t repeat something similar and don’t lose around 2 weeks on a 1/3-day-worth development item:

At first, we focused on how easy the application was on the sample board, and we thought there must be a significantly big mistake, such as a copy-paste error, a call to the wrong function, or buffers being emptied too early. So we tried to spot that fault: we changed parameters, made variables static or non-static, adjusted stack and heap allocations, and so on, checking the functionality repeatedly after each change. We didn’t consider mathematical operations such as buffer allocations. Our thinking was: “This code works; the issue appeared when we integrated it into our project, so it must be an integration issue.”
Then we wrote validation scripts; in other words, we created dev kits for software. In the early stages of this project, we had noticed the importance of having a tool set to validate or falsify the steps we were taking, and we applied that lesson here. This method let us see that 5 milliseconds of data were missing in every 20 milliseconds of data.
After solving the issue, we removed the unnecessary code and cleaned up the structure. The codebase for this feature shrank to 30-35% of its original size, and performance even improved. When we evaluated the case, we saw that writing the code from scratch would have been easier than porting it with the original structure. The effort would have been less than 3 days.
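The kind of check that exposed the missing 5 milliseconds per 20 milliseconds can be sketched in a few lines of Python. This is not our actual script: the function names, the 16 kHz mono 16-bit format and the synthetic recording are assumptions for illustration. The idea is to compare the real duration of a captured WAV file with the duration implied by the number of frames that were sent.

```python
import io
import wave


def wav_duration_ms(wav_file):
    """Return the duration of a WAV file (path or file object) in milliseconds."""
    with wave.open(wav_file, "rb") as w:
        return 1000.0 * w.getnframes() / w.getframerate()


def missing_ms_per_frame(actual_ms, frame_count, frame_ms=20.0):
    """Return how many milliseconds of audio are missing, on average, per frame."""
    return frame_ms - actual_ms / frame_count


def make_test_wav(duration_ms, sample_rate_hz=16_000):
    """Build a silent mono 16-bit WAV in memory (stand-in for a real capture)."""
    buf = io.BytesIO()
    with wave.open(buf, "wb") as w:
        w.setnchannels(1)
        w.setsampwidth(2)
        w.setframerate(sample_rate_hz)
        w.writeframes(b"\x00\x00" * (sample_rate_hz * duration_ms // 1000))
    buf.seek(0)
    return buf


# Synthetic example: 50 frames of 20 ms were sent (1000 ms expected),
# but the capture only holds 750 ms of audio.
capture = make_test_wav(750)
actual = wav_duration_ms(capture)
print(actual, missing_ms_per_frame(actual, frame_count=50))
```

With a real capture, the path of the recorded file would replace the in-memory buffer. A constant deficit per frame points toward a sizing problem rather than random packet loss.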

Fallacy of Structure: Our fallacy here was believing that the sample project’s structure was a good and flexible system. When we completed the process, we saw that the structure was overkill for the job. So the existing structure of working code can be worse than code written from scratch. The criteria should be the debuggability and readability of the code.

Fallacy of Proof: When we said “the code is working in the sample environment”, we were comparing misaligned contexts. Our situation was like the famous saying “it’s running on my local environment”. So we should understand the context and adopt practices that isolate the effect of the environment, the way backend developers do when they run code in a container.

I guess this might be one of my strongest engineering management lessons.
