• 2 Posts
  • 35 Comments
Joined 2 years ago
cake
Cake day: June 15th, 2023

help-circle

  • This is about thew new starter cost.

    When a developer joins a team, they will not be as productive as they have to learn the code, frameworks, libraries, the project purpose, the tooling, etc… Often this impacts other members of the team lowering the entire teams productivity.

    When you use productivity tracking (e.g. things like capacity planning) you will see the teams performance drop and it will take time for it to exceed the previous measured performance. This is the cost of adding a new starter.

    So if it takes 6 weeks for a new starter to increase overall team producitivty then planning someone on a project for 4 weeks is pointless since the team will have a higher delivery rate without the extra person. This is typically why an organsation loses its ability to migrate staff between projects.

    Code formating affects the layout of the code and our brains do all sorts of tricks around pattern recognition, so if your code formatting rules are too different a someone migrating between projects has to spend time looking for code and retraining their brain.

    Its an additional barrier and a one within an organisations skills to remove (by forcing a common code standard).



  • Python is unique in formatting forms part of the syntax, every language has linters but its far more common for orgs to tweak the default rules .

    For example Java has Checkstyle. The default rules ‘sun checks’ give a line length of 80, tabs are 4 spaces and everything is placed on a new line.

    Junior devs inevitably want to trash the line length (honestly on 1080p monitors, 120 makes sense,).

    There is always a new line/same line discussion (everyone perfers same line but there is always one die hard new line person).

    The tab width discussion always has one junior dev complain that “tabs are better”, as someone who started development on Visual Studio 6 where half the team double spaced, the other half used tabs. Those people get a lecture from me on how we can convert tabs to spaces but not the inverse so it will always be spaces if I am near.

    With Checkstyle you upload the rule file as an artifact into your M2 repository. Then you can pull it down as a dependency when the checkstyle plugin runs.


  • I avoid any company that requires a software test before the interview.

    I worked for a company that introduced them after I joined, I collected evidence all of the companies top performers wouldn’t have joined since we all had multiple offers and having to do the test would put people off applying. The scores from it didn’t correlate with interview results so it was being ignored by everyone. Still took 2 years to get rid of it.

    The best place used STAR (Situation Task Action Result) based interviews. The goal was to ask questions until you got 2 stars.

    I thought these were great because it was more varied and conversational but there was a comparable consistency accross interviewers.

    You would inevitably get references to past work and you switch to asking a few questions about that. Since it was around a situation you would get more complete technical explanations (e.g. on that project I wrote an X and Y was really challenging because of Z).

    I loved asking “Tell me about something your really proud off”. Even a nervous junior would start opening up after that question.

    After an hour interview you would end up with enough information you could compare them against the company gradings (junior, senior, etc…).

    This was important because it changed the attitude of the interview. It wasn’t a case of if the candidate would be a good senior dev for project X, but an assessment of the candidate. If they came out as a lead and we had a lead role, lets offer them that.




  • Python’s public API changes subtly, so minor changes in Python version can lead to massive changes in the version of dependencies you use.

    A few years ago we developed a script to update Cassandra on Python 2.7.Y. Production environment used Python 2.7.X (it was 5 patch releases earlier).

    This completely changed the cassandra library version. We had to go back 15 patch releases which annoying resulting in a breaking change in the Cassandra libraries API and wouldn’t work on the dev environments Python.

    Python 3 hasn’t solved this, 2 years ago I was asked to look at a number of Machine Learning projects running in docker. Upgrading Python from 3.4 to 3.8 had a huge effect on dependencies and figuring out the right combination was a huge pain.

    This is a solved problem in Java, Node.js has the same weakness but their changes to language spec are additive so old code runs on new releases (just not the inverse). Ruby has exactly the same issues as Python


  • This advice isn’t grounded in reality.

    Management normally defines ways to track and judge itself, these are typically called Key Performance Indicators.

    KPI’s are normally things like contract value growth, new contracts signed, profit margin, etc…

    So if the project manager is meeting or exceeding their KPI’s and you walk up to their boss telling them the PM is failing as basic job functions, the boss won’t care.

    This is because the boss might have set the KPI’s or the boss might also be judged on them. In either situation its to the bosses advantage to ignore you.

    The boss will only care if there is a KPI you can demonstrate the PM failing to meet.

    Every person/group will have various incentives and motivations. To affect change you have to understand what they are.


  • A project manager has responsibility for delivery of a project but they typically lack domain specific knowledge. As a result they can’t directly deliver something, merely ask subject matter experts for advice and facilitate a team to deliver.

    Most PM’s cope with the stress of this position poorly.

    This cartoon is an example of micro management (a common coping mechanisim), the manager has involved themselves in the low level decisions because that gives a sense of control. If a technical team then tell them its a bad decison the team are effectively attacking their coping mechanisim.

    The solution isn’t to tell them their technical idea is terrible, when you’ve fallen down this rabbit hole you have to treat the PM as a stakeholder. They are someone you have to manage, so a common solution is to give them confidence there is a path to delivery, a way to track and understand it.


  • SpaceX are launching 26-52 satellites at a time and have sustained 3 launches a week for most of the year.

    The satellites are in a Low Earth Orbit, without constant thrust, atmospheric drag will force them to re enter earths atmosphere within a few months. This means they aren’t adding to junk in space.

    Unlike Nasa, ULA, Arriannespace, RoscosMos, etc… SpaceX have always performed 2nd Stage Deorbit burns, so they aren’t adding to Space junk by launching either.

    The Low Earth Orbit is to ensure low latency and the need for constant thrust means the satellites have a short life expectancy by design. That is why SpaceX fought to keep the satellites as cheap as possible (e.g. $250k)

    First stage booster reuse and fairing reuse means the majority of the launch cost is the second stage ($15 million).

    The whole lot is privately funded



  • If you read the reports…

    Normally JPL outsource their Mars mission hardware to Lockheed Martin. For some reason they have decided to do Mars Sample Return in house. The reports argue JPL hasn’t built the necessary in house experience and should have worked with LM.

    Secondly JPL is suffering a staff shortage which is affecting other projects and the Mars Sample Return is making the problem worse.

    Lastly if an organisation stops performing an action it “forgets” how to do it. You can rebuild the capability but it takes time.

    A team arbitrary declaring they are experts and suddenly decideding they will do it is one that will have to relearn skills/knowledge on a big expensive high profile project. The project will either fail (and be declared a success) or masses of money will be spent to compensate for the teams learning.

    Either situation is not ideal


  • The GAO has performed an annual review of the Space Launch System every year since 2014 and switched to reviewing the Artemis program in 2019.

    Each year the GAO points out Nasa isn’t tracking any costs and Nasa argues with the GAO about the costs they assign. Then the GAO points out Nasa has no concrete plan to reduce costs, Nasa then goes nu’uh (see the articles cost reduction “objectives”).

    The last two reports have focused on the RS-25 engine, last time the GAO was unhappy because an engine cost Nasa $100 million and Nasa had just granted a development contract to reduce the cost of the engine.

    However if you took the headline cost of the contract and split it over planned engines it was greater than the desired cost savings. Nasa response was development costs don’t count.

    Congress reviews GAO reports and decides to give SLS more money.


  • The other person was just wrong.

    Large scale Hydrogen generation isn’t generated in a fossil free way, Hydrogen can be generated is a green way but the infrastructure isn’t there to support SLS.

    Hydrogen is high ISP (miles per gallon) by rubbish thrust (engine torque).

    This means SLS only works with Solid Rocket Boosters, these are highly toxic and release green house contributing material into the upper atmosphere. I suspect you would find Falcon 9/Starship are less polluting as a result.

    Lastly the person implies SLS could be fueled by space sources (e.g. the moon).

    SLS is a 2.5 stage rocket, the boosters are ditched in Earths Atmosphere and the first stage ditched at the edge of space. The current second stage doesn’t quite make low earth orbit.

    So someone would have to mine materials on the moon and ship them back. This would be far more expensive than producing hydrogen on Earth.

    Hydrogen on the moon makes sense if your in lunar orbit, not from Earth.


  • Do not mix tabs and spaces.

    Its impossible to automate checking that tabs were only used for indentation and spacing for precise alignment. So you then take on a burden of manually checking

    You end up with the issue where someone didn’t realise and space idented or anouther person used tabs for precise alignment and people forget to check the whitespace characters in review and it ends up going inconsistent and becoming a huge pile of technical debt to fix.

    Use only one, you can automate enforcement and ensure the code renders consistency.



  • Years ago there was no way to share IDE settings between developers.

    You ended up with some developers choosing a tab width of 2 spaces, some choosing 4 spaces and as there was no linting enforcement some people using 2-4 spaces depending on their IDE settings.

    This resulted in an unreadable mess as stuff was idented to all sorts of random levels.

    It doesn’t matter if you use tabs or spaces as long as only one type is consistently used within a project.

    Spaces tends to win because inevitably there are times you need to use spaces and so its difficult to ensure a project only uses tabs for identation.

    IDE’s support converting tabs into spaces based on tab width and code formatting will ensure correct indentation. You can now have centralised IDE settings so everyone gets the same setup.

    Honestly 99% of people don’t care about formatting (they only care when consistency isn’t enforced and code is hard to read), there is always one person who wants a 60 charracter line width or only tabs or double new lined parathensis. Who then sucks up huge amounts of the team time arguing their thing is a must while they code in emacs, unlike the rest of the team using an actual ide.


  • I am actually arguing for a stable ABI.

    The few times I have had to compile out of tree drivers for the linux kernel its usually failed because the ABI has changed.

    Each time I have looked into it, I found code churn, e.g. changing an enum to a char (or the other way) or messing with the parameter order.

    If I was empire of the world, the linux kernel would be built using conan.io, with device trees pulling down drivers as dependencies.

    The Linux ABI Headers would move out into their own seperately managed project. Which is released and managed at its own rate. Subsystem maintainers would have to raise pull requests to change the ABI and changing a parameter from enum to char because you prefer chars wouldn’t be good enough.

    Each subsystem would be its own “project” and with a logical repository structure (e.g. intel and amd gpu drivers don’t share code so why would they be in the same repo?) And built against the appropriate ABI version with each repository released at its own rate.

    Unsupported drivers would then be forked into their own repositories. This simplifies depreciation since its external to the supported drivers and doesn’t need to be refactored or maintained. If distributions can build them and want to include the driver they can.

    Linus job would be to maintain the core kernel, device trees and ABI projects and provide a bill of materials for a selection of linux kernel/abi/drivers version which are supported.

    Lastly since every driver is a descrete buildable component, it would make it far easier for distributions to check if the driver is compatible (e.g. change a dependency version and build) with the kernel ABI they are using and provide new drivers with the build.

    None of this will ever happen. C/C++ developers loath dependency management and people can ve stringly attached to mono repos for some reason.


  • The linux kernel is very old school in how it is run and originally a big part of the DevSecOps movement was removing a lot of manual overhead.

    Moving on to something like Gitea (codeberg) would give you a better diff view and is quicker/easier than posting a patch to a mailing list.

    The branching model of the kernel is something people write up on paper that looks great (much like Gitflow) but is really time consuming to manage. Moving to feature branch workflow and creating a release branches as part of the release process allows a ton of things to be automated and simplified.

    Similarly file systems aren’t really device specific, so you could build system tests for them for benchmarking and standard use cases.

    Setting up a CI to perform smoke testing and linting, is fairly standard.

    Its really easy to setup a CI to trigger when a new branch/pr is created/updated, this means review becomes reduced to checking business logic which makes reviews really quick and easy.

    Similarly moving on to a decent issue tracker, Jira’s support for Epic’s/stories/tasks/capabilities and its linking ability is a huge simplifier for long term planning.

    You can do things like define OKR’s and then attach Epics to them and Stories/tasks to epics which lets you track progress to goals.

    You can use issues the way the linux community currently uses mailing lists.

    Combined with a Kanban board for tracking, progress of tickets. You remove a ton of pain.

    Although open source issue trackers are missing the key productivity enablers of Jira, which makes these improvements hard to realise.

    The issue is people, the linux kernel maintainers have been working one way for decades. Getting them to adopt new tools will be heavily resisted, same with changing how they work.

    Its like everyone outside, knows a breaking the ABI definition from the sub system implementation would create a far more stable ABI which would solve a bunch of issues and allow change when needed, except no one in the kernel will entertain the idea.


  • During the pandemic I had some unoccupied python graduates I wanted to teach data engineering to.

    Initially I had them implement REST wrappers around Apache OpenNLP and SpaCy and then compare the results of random data sets (project Gutenberg, sharepoint, etc…).

    I ended up stealing a grad data scientist because we couldn’t find a difference (while there was a difference in confidence, the actual matches were identical).

    SpaCy required 1vCPU and 12GiB of RAM to produce the same result as OpenNLP that was running on 0.5 vCPU and 4.5 GiB of RAM.

    2 grads were assigned a Spring Boot/Camel/OpenNLP stack and 2 a Spacy/Flask application. It took both groups 4 weeks to get a working result.

    The team slowly acquired lockdown staff so I introduced Minio/RabbitMQ/Nifi/Hadoop/Express/React and then different file types (not raw UTF-8, but what about doc, pdf, etc…) for NLP pipelines. They built a fairly complex NLP processing system with a data exploration UI.

    I figured I had a group to help me figure out Python best approach in the space, but Python limitations just lead to stuff like needing a Kubernetes volume to host data.

    Conversely none of the data scientists we acquired were willing to code in anything but Python.

    I tried arguing in my company of the time there was a huge unsolved bit of market there (e.g. MLOP’s)

    Alas unless you can show profit on the first customer no business would invest. Which is why I am trying to start a business.