Bit of a weird observation: "Seeing a new computing paradigm coming out of Data Science / Observability"

Justin@lemmy.jlh.name · edit-2 7 months ago

lysdexic@programming.dev · 7 months ago

Perhaps I’m being dense and the coffee hasn’t kicked in yet, but I fail to see where the new computing paradigm mentioned in the title actually is.

From their inception, computers have been used to plug in sensors, collect their values, and use them to compute stuff and things. For decades, every consumer-grade laptop has had adaptive active cooling, which means spinning up fans and throttling down CPUs when sensors report values over a threshold. One of the most basic aspects of programming is checking whether a memory allocation succeeded and handling the out-of-memory scenario if it didn’t. Updating app state when network connections go up or down is also a very basic feature. Concepts like retries, jitter, and exponential backoff have become basic features provided by dedicated modules. From the start Docker provided support for health checks, which are basically an endpoint designed to be probed periodically. There are also canary tests to check if services are reachable and usable.
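To make the retry point concrete, here is a minimal sketch of retries with exponential backoff and full jitter, using only the Python standard library. The function name `retry` and all of its parameters are made up for illustration and are not taken from any particular module.

```python
import random
import time


def retry(operation, attempts=5, base=0.1, factor=2.0, cap=5.0,
          sleep=time.sleep, rng=random.random):
    """Call `operation` until it succeeds or `attempts` is exhausted.

    The wait before retry number n is drawn uniformly from
    [0, min(cap, base * factor**n)), i.e. exponential backoff with full jitter.
    `sleep` and `rng` are injectable so the behaviour can be tested.
    """
    for attempt in range(attempts):
        try:
            return operation()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error to the caller
            ceiling = min(cap, base * factor ** attempt)
            sleep(rng() * ceiling)
```

Something like `retry(lambda: fetch_config())` would then wrap any flaky call (`fetch_config` being a hypothetical function here). The jitter is what keeps a crowd of clients from retrying in lockstep.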

These have existed for decades. This stuff has been done in production software since the 90s.

Where’s the novelty?

Justin@lemmy.jlh.name · edit-2 7 months ago

Those are good points.

Control loops have of course always existed in industrial computing, but I think it’s exceptional how common they are now in modern servers and PCs.

That’s a good point about memory allocation. I guess a lot of syscalls could be considered to be part of this data-centric self-adaptation mode of operation that I’m trying to describe.

I think retries and exponential backoff are more of a single-threaded error-handling operation. That’s different from the operations I’m describing, which instead involve multiple services communicating with each other to adapt to changing conditions.

As far as I can tell, Docker didn’t add healthchecks until 1.12 in 2016. I do think Docker healthchecks are a good example of the service orchestration design that has become very popular recently, though.

To be fair, I didn’t start seriously programming until around 2017, so maybe I’m missing some of the history that shows that this sort of data-centric adaptation was popular prior to 2010.

bigredgiraffe@lemmy.world · 7 months ago

I think what this person is saying is that systems and services have been monitored for metrics and logs for a long time. I know I have been doing it for more than 20 years across many OSes, hardware platforms, and software stacks. The tools and the depth of the integrations have definitely changed and gotten way better and more sophisticated, but I have definitely made systems that monitored and healed themselves, to varying levels of efficiency, since at least using Nagios in 2003 (I’m getting Perl PTSD flashbacks now hah).

One thing that has definitely gotten better in the last 5 or so years though is code level instrumentation and tracing as well as the higher level correlation tools. I have also seen more developers and vendors way more willing to implement monitoring features in their code from the beginning leading to more data and less duct tape and guessing which has been FANTASTIC.

Anyway, great post though, the monitoring arena is definitely way more awesome than ever before these days that is for sure.

zik@lemmy.world · edit-2 7 months ago

I think the timeline’s a bit off here.

OP describes how primitive computing was in the 80s and 90s, and speaks of a number of developments which appeared “leading up to the year 2000”. Let me give examples of all of these developments which were actually from the 1970s or earlier:

The VAX-11/780, introduced in 1977, pretty much introduced the concept of a modern MMU and memory model, although there were plenty of precursors. These machines were very popular and widespread.
Lisp’s been around since 1958. It (and other languages) used memory managed runtimes similar in concept to today’s ones.
IBM’s VM/370 OS introduced virtual machines on IBM mainframes in 1972. They were an integral part of the OS and CPU architecture, probably more so than current VMs which are kind of tacked on as an afterthought.
Modular programming languages were a big topic in this era. One that comes to mind is Modula-2 which was first introduced in 1977, but much programming language development at the time was focused on modularity and code reuse.
And JITs date back to 1960.

My point is that I think these advancements were made a lot earlier than OP’s saying. Sure, some of them took a while to spread but we pretty much started the 80s with all of this already in place.

slacktoid@lemmy.ml · 7 months ago

I think there’s a real use case for it. The AI hype train has kinda ruined it for a lot of people, but AI has always been a tool to optimize a problem.

JakenVeina@lemm.ee · edit-2 7 months ago

C#/.NET makes HUGE usage of these kinds of internal self-optimizations. Just this year, in particular, the team made some pretty big expansions on the types and scopes of JIT optimizations that the runtime can perform. Article, if you’re curious.

Justin@lemmy.jlh.name · 7 months ago

JITs and branch prediction are probably good examples of the kind of computing I’m describing.

RonSijm@programming.dev · 7 months ago

I don’t know if this is a relatively “new” computing paradigm, though if you compare it to the pre-2010 era, it’s pretty much the standard for bigger applications. And I think it’s very much tied in with the move to the cloud computing paradigm.

In the good old days everyone just had their own servers running somewhere, so what were you going to do when it was super busy on your platform? Add a new server for a couple of days? If you had a new server anyway, you’d just permanently add it to the network.

With cloud computing, as you mentioned, there’s service orchestration like Kubernetes, auto-scaling of bare-metal machines, and serverless applications that just keep track of usage and allow you to very easily add more power temporarily based on demand, and upscale your infra for the time that it’s needed.
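As a rough illustration of scaling on demand, here is a sketch of the proportional rule that the Kubernetes Horizontal Pod Autoscaler documents: desired replicas = ceil(current replicas × current metric / target metric). The function name, the parameter names, and the min/max defaults are assumptions made up for this example.

```python
import math


def desired_replicas(current_replicas, current_metric, target_metric,
                     min_replicas=1, max_replicas=10):
    """Scale the replica count in proportion to how far the observed
    metric is from its target, then clamp to the allowed range."""
    raw = math.ceil(current_replicas * current_metric / target_metric)
    return max(min_replicas, min(max_replicas, raw))
```

For example, 4 replicas averaging 90% CPU against a 60% target would be scaled to 6, and scaled back down once usage drops.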

If you start getting into paradigms like that, you might end up with 100s of services running at the same time (multiple copies of the same services for load balancing, edge locations, etc.). Then you also don’t want cross-cutting concerns like logging and analytics hard-coded in every service like you’d potentially do in a monolith. And you need those kinds of metrics to see that everything is still running healthy, and to automatically kill unhealthy services and replace them with new ones, etc.
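The “kill unhealthy services and replace them” part can be sketched as one pass of a control loop. This is only a toy model under stated assumptions: `Instance`, `reconcile`, `probe`, and `spawn` are hypothetical names, and a real orchestrator does far more than this.

```python
from dataclasses import dataclass


@dataclass
class Instance:
    name: str
    failures: int = 0  # consecutive failed health probes


def reconcile(instances, probe, spawn, failure_threshold=3):
    """Run one pass of the loop: probe every instance, and replace any
    instance that has failed `failure_threshold` probes in a row."""
    result = []
    for instance in instances:
        if probe(instance):
            instance.failures = 0  # healthy again: reset the streak
        else:
            instance.failures += 1
        if instance.failures >= failure_threshold:
            result.append(spawn())  # drop the sick one, start a fresh one
        else:
            result.append(instance)
    return result
```

Calling `reconcile` on a timer, with `probe` hitting each instance’s health endpoint, gives the basic self-healing behaviour described above.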

Justin@lemmy.jlh.name · 6 months ago

That’s a really good point. I guess it ties into the “cattle, not pets” mindset. It’s pretty easy to tell if your pet is sick, but you need to have systems in place to be able to tell if your cattle are sick.

ohlaph@lemmy.world · 7 months ago

Thank you.

robinm@programming.dev · 7 months ago

Interesting take, but I think you are right. It’s indeed critical to know how your product is used nowadays.

sine@programming.dev · 7 months ago

LLMs in particular seem well suited to extracting semantically correct insights from unstructured data. When it comes to observability we’re in a better spot, since we have discrete structured data, which makes it easy to build rules and logic on top of it. I don’t think this kind of tooling will benefit much from recent advances. If anybody has anything worth showing, I’d love to check it out.
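To show what I mean by rules on top of structured data, here is a minimal sketch of an error-rate alert. The sample shape (`requests` and `errors` keys), the function name, and the default threshold and window are all assumptions for illustration, not any particular tool’s format.

```python
from statistics import mean


def error_rate_alert(samples, threshold=0.05, window=3):
    """Return True when the mean error rate over the last `window`
    samples exceeds `threshold`. Samples with no requests are skipped."""
    recent = samples[-window:]
    rates = [s["errors"] / s["requests"] for s in recent if s["requests"]]
    return bool(rates) and mean(rates) > threshold
```

With data in this shape the rule is a couple of lines of arithmetic, with no need to infer meaning from free text.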