Wednesday, November 30, 2022

Ex-Twitter Tech Lead Says Platform's Infrastructure Can Sustain Engineering Layoffs

Years of infrastructure planning will keep Twitter's systems from going down anytime soon, according to a senior engineer who left the platform in August.

Matthew Tejo, a former Site Reliability Engineer (SRE) at Twitter, explained on his blog that he spent most of his career at the company automating systems where possible and planning for disasters where it was not, and that the platform can continue to operate provided there are no significant changes to the existing system.

Tejo's explanation of Twitter's infrastructure design comes after tech insiders questioned whether Twitter would be able to keep running after new CEO Elon Musk laid off most of its tech staff.

Tejo says Twitter relies heavily on caching to manage traffic, cut response times across the site, and significantly reduce overall server costs.
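
To illustrate why caching lowers server load, here is a minimal sketch of a read-through cache. It is not Twitter's implementation; the class `ReadThroughCache` and the function `slow_lookup` are hypothetical names chosen for this example, and a real cache would also need eviction and expiry.

```python
class ReadThroughCache:
    """Toy in-memory cache: only misses reach the backend."""

    def __init__(self, backend):
        self._backend = backend      # callable that does the expensive work
        self._store = {}             # key -> cached value
        self.backend_calls = 0       # how often the backend was actually hit

    def get(self, key):
        if key not in self._store:
            # Cache miss: pay the backend cost once and remember the result.
            self.backend_calls += 1
            self._store[key] = self._backend(key)
        return self._store[key]


def slow_lookup(tweet_id):
    # Hypothetical stand-in for a database or service call.
    return f"tweet body for {tweet_id}"


cache = ReadThroughCache(slow_lookup)
for _ in range(1000):
    cache.get(42)

print(cache.backend_calls)  # 1: the other 999 reads were served from the cache
```

A thousand reads of the same item cost the backend a single call, which is the effect behind both the faster responses and the lower server bill.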

This cache runs on the Aurora framework, which in turn runs on the open-source Apache Mesos project. While Aurora deploys applications to servers, Mesos aggregates the servers and kills them off in the event of an outage.

Since Mesos can't detect all hardware issues, Twitter relies on manual monitoring by the IT department to check for issues like hard drive failure. If a fault is found, repair staff are automatically dispatched to the data center to fix the problem.

Twitter's small remaining workforce - estimated at just 20% of its peak after the latest round of layoffs - could prove problematic as fewer engineers now have to do the same amount of work.

However, Tejo also revealed that at any given time Twitter has two data centers operating simultaneously, each capable of running all core services on the platform and so of absorbing a catastrophic outage at the other site. This means Twitter is constantly provisioned at 200% of the capacity it needs in the worst case, making it much less likely to go down due to a lack of server resources.
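
The 200% figure is simple arithmetic, sketched below. This is an illustration only; the function name `capacity_after_failures` and the convention of expressing each site's capacity as a fraction of peak load are assumptions made for the example.

```python
def capacity_after_failures(site_capacities, failures):
    """Worst-case remaining capacity (as a fraction of peak load)
    after `failures` sites go down, assuming the largest sites fail first."""
    if failures < 0 or failures > len(site_capacities):
        raise ValueError("failures must be between 0 and the number of sites")
    # Sort ascending and keep the smallest sites: the pessimistic case.
    survivors = sorted(site_capacities)[: len(site_capacities) - failures]
    return sum(survivors)


# Two data centers, each able to carry the full load (1.0 = 100% of peak).
sites = [1.0, 1.0]

print(capacity_after_failures(sites, 0))  # 2.0 -> 200% with both sites up
print(capacity_after_failures(sites, 1))  # 1.0 -> still 100% after losing one
```

With both sites up the platform has twice what it needs, and losing an entire data center still leaves exactly enough to serve peak traffic.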

Twitter also uses specific tools to ensure that servers are distributed safely as soon as they are assigned: "These tools ensure that the team does not have too many physical servers in a rack and that everything is distributed in a way that there is no problem in case of failure," says Tejo.
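
The kind of rule Tejo describes can be sketched as a per-rack cap on placement. The code below is a hypothetical illustration, not Twitter's tooling; `spread_across_racks`, the `(server_id, rack_id)` tuples, and the `max_per_rack` parameter are all assumptions made for the example.

```python
from collections import defaultdict


def spread_across_racks(servers, count, max_per_rack):
    """Pick `count` servers from `servers` (a list of (server_id, rack_id))
    while never taking more than `max_per_rack` from the same rack."""
    per_rack = defaultdict(int)   # rack_id -> servers already chosen there
    chosen = []
    for server_id, rack_id in servers:
        if len(chosen) == count:
            break
        if per_rack[rack_id] < max_per_rack:
            per_rack[rack_id] += 1
            chosen.append(server_id)
    if len(chosen) < count:
        # Refuse an unsafe placement rather than silently overloading a rack.
        raise ValueError("not enough racks to satisfy the spread constraint")
    return chosen


inventory = [("s1", "r1"), ("s2", "r1"), ("s3", "r2"), ("s4", "r3")]
print(spread_across_racks(inventory, 3, max_per_rack=1))  # ['s1', 's3', 's4']
```

Because no rack holds more than one of the chosen servers, losing any single rack takes out at most one of them.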

Unknown infrastructure issues, or changes introduced in the wave of changes made by new Twitter CEO Elon Musk, could further destabilize the platform. Despite the effort to make Twitter at least partially self-sufficient, Tejo admits that "I think there are flaws lurking somewhere."

Shortly after Musk's takeover, Reuters reported that Musk planned to cut $1 billion worth of infrastructure costs in the coming months. Sources who spoke to reporters said spending of $1.5 million to $3 million on servers and cloud services was deemed unnecessary, suggesting the extensive safety redundancies described by Tejo may not be sustainable.

"I don't want to implement a system or service that comes together under extreme pressure. Standards will break. Data will be lost," xDesign CPTO Jeff Watkins told IT Pro.

“Worse, there will probably be a massive brain drain. As a result, the remaining teams are unlikely to be the A-team.

“So the impact on user data could be negative, but people don't only use Twitter through the website and mobile app. There are also some interesting side uses through the API, like tracking system outages (everyone tweets about them). Destabilizing what is almost the social shadow IT could have unintended consequences on a global scale.”

It's not yet clear whether Twitter's infrastructure will support the platform long-term with fewer engineers working on it. Despite the excess server resources available, software bugs are common and require skilled engineers to fix them to keep the service running smoothly.

Twitter has been in a period of rapid change since Elon Musk acquired the platform on October 27. In the weeks that followed, a number of company executives left their posts, half of the workforce was laid off overnight amid chaos in which employees were locked out of their e-mail, and a slew of remaining employees responded to Musk's demand for harder working terms by resigning in "rebellion".

On Monday, The Verge reported that Musk was cutting benefits for Twitter employees, including company allowances for childcare, home internet and health care. The same report says employees must now provide managers with a full summary of their work at the end of each week.
