NCR Work Term Brief

Overview

I joined NCR Corporation as a Systems Software Engineering Intern, based in Waterloo, Ontario. Over the term I worked on two high-impact projects, one in server software and one in full-stack development.

Server Software

My manager introduced me to an interesting problem: NCR has tens of thousands of virtual machines, but there was no way for us to extract metrics from them. We wanted to collect per-VM metrics such as uptime and CPU load, aggregate that data, and draw correlations from it.

My solution was a highly scalable, fault-tolerant system service written in Python with Pandas. The service connected to a VMware interface and queried the tens of thousands of VMs that exist within the company. Several measures were put in place to keep the software reliable and efficient, such as connection caching and thorough end-to-end testing.
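To illustrate the connection caching idea, here is a minimal sketch. It is not the code used at NCR: the `ConnectionCache` class and the `connect` callable are hypothetical names, and the callable stands in for whatever call actually opens a session with the VMware interface.

```python
from typing import Callable, Dict, Generic, TypeVar

T = TypeVar("T")


class ConnectionCache(Generic[T]):
    """Reuse one connection per endpoint instead of reconnecting on every query."""

    def __init__(self, connect: Callable[[str], T]) -> None:
        # `connect` is a hypothetical factory that opens a connection
        # to the given endpoint (assumption, not a real VMware API).
        self._connect = connect
        self._connections: Dict[str, T] = {}

    def get(self, endpoint: str) -> T:
        # Open a connection only the first time an endpoint is requested.
        if endpoint not in self._connections:
            self._connections[endpoint] = self._connect(endpoint)
        return self._connections[endpoint]

    def invalidate(self, endpoint: str) -> None:
        # Drop a cached connection, for example after a network error,
        # so that the next get() reconnects.
        self._connections.pop(endpoint, None)
```

With tens of thousands of VMs behind a small number of endpoints, a cache like this would avoid paying the connection setup cost for every query, while `invalidate` gives the caller a way to recover from a stale connection.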

The service ran every day. It stored the retrieved metrics historically in a MongoDB database and generated a spreadsheet for auditing teams to review when needed. Because of this project, various teams at NCR were able to obtain key metrics about the company's virtual machine infrastructure and use them to make business decisions.
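The aggregation step can be sketched with Pandas roughly as follows. The record fields (`name`, `cluster`, `uptime_hours`, `cpu_load`) and the function name `summarize_by_cluster` are assumptions made for illustration; the real schema is not described in this brief.

```python
import pandas as pd


def summarize_by_cluster(records):
    """Aggregate per-VM metric records into one summary row per cluster."""
    # Each record is a dict with hypothetical keys:
    # name, cluster, uptime_hours, cpu_load.
    frame = pd.DataFrame(records)
    summary = (
        frame.groupby("cluster")
        .agg(
            vm_count=("name", "count"),
            mean_cpu_load=("cpu_load", "mean"),
            min_uptime_hours=("uptime_hours", "min"),
        )
        .reset_index()
    )
    return summary
```

A summary frame like this could then be written out with `summary.to_csv(...)` to produce a spreadsheet-style file for review.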

Full-stack Development

This co-op term was an interesting one. After I finished my server software project, one of the general managers in the engineering org approached me with a new problem. NCR has many webpages owned by different stakeholders, and the infrastructure team provides these stakeholders with SSL certificates. However, there was no way for us to ensure that stakeholders kept their certificates up to date, especially if they did not set a reminder for themselves.

The solution we decided on was a web portal where stakeholders can enter a website and their contact information. Once the information is entered, a service continuously monitors the SSL certificates of the submitted websites and reaches out to the contacts when a certificate is approaching its expiry date.
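The core check such a monitoring service performs can be sketched as below. The actual backend was written in NodeJS; this sketch uses Python's standard library instead, and the function names and the 30-day threshold are assumptions for illustration only.

```python
import socket
import ssl
from datetime import datetime, timezone


def fetch_not_after(host: str, port: int = 443, timeout: float = 5.0) -> str:
    """Return the 'notAfter' field of the certificate served by host."""
    # The default context verifies the certificate, so an already
    # expired certificate raises an SSL error during the handshake.
    context = ssl.create_default_context()
    with socket.create_connection((host, port), timeout=timeout) as sock:
        with context.wrap_socket(sock, server_hostname=host) as tls:
            return tls.getpeercert()["notAfter"]


def days_until_expiry(not_after: str, now: datetime) -> int:
    """Whole days between `now` (timezone-aware) and the expiry time."""
    expires = datetime.fromtimestamp(
        ssl.cert_time_to_seconds(not_after), tz=timezone.utc
    )
    return (expires - now).days


def needs_reminder(days_left: int, threshold_days: int = 30) -> bool:
    """Decide whether the contact should be notified (threshold is assumed)."""
    return days_left <= threshold_days
```

A scheduler would call `fetch_not_after` for each submitted website, pass the result to `days_until_expiry` with `datetime.now(timezone.utc)`, and notify the registered contact when `needs_reminder` returns true.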

The project had several parts. The web portal was implemented with React and Express, and the backend service was implemented using NodeJS and MongoDB. We still had one more problem to solve: what if the website to be monitored is behind a private network or a firewall? To solve this, I brushed up on my communication protocols and created a very lightweight Python agent that can be run on the servers that host internal or firewalled websites. The agent securely exposes a port, through an intermediary network, for our backend service to connect to.