Data Engineering (SQL focus)
At companies of all sizes, there are large collections of datasets connected through shared columns, typically an ID of some kind. Together, these linked datasets form a relational database, which can range in size from a few thousand rows to many millions of rows that grow daily. From an engineering perspective, SQL is a widely used language for data pipelining and ETL (extract, transform, load). From an analyst's or data scientist's perspective, SQL is a powerful query language for extracting semi-cleaned data for modeling and analysis. While Python and R are powerful for uses such as machine learning and data cleaning, they are less suited to querying large databases on demand. SQL is also widely used across the industry. Should you pursue any of the traditional data roles (e.g., Data Engineer, Data Scientist, Data Analyst) at larger corporations, you may be expected to be familiar with the basics.
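To make "datasets connected through shared columns" concrete, here is a minimal sketch using Python's built-in `sqlite3` module and an in-memory SQLite database. The `customers` and `orders` tables, their columns, and the sample rows are all made up for illustration; the point is only that the shared `customer_id` column lets one query combine both tables.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# Two hypothetical tables that share a customer_id column.
conn.executescript("""
    CREATE TABLE customers (
        customer_id INTEGER PRIMARY KEY,
        name        TEXT
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customers(customer_id),
        amount      REAL
    );
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 15.0), (12, 2, 40.0);
""")

# Join the tables on the shared ID and total each customer's orders.
rows = conn.execute("""
    SELECT c.name, SUM(o.amount) AS total_spent
    FROM customers AS c
    JOIN orders AS o ON o.customer_id = c.customer_id
    GROUP BY c.name
    ORDER BY c.name
""").fetchall()

print(rows)
conn.close()
```

The same `JOIN ... ON` pattern applies in other SQL dialects, although setup and connection code will differ by database.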
As with programming languages, there are multiple dialects of SQL. While they largely share a common structure in how they query data, small differences exist in their syntax. See here and here for common differences you may encounter. Fortunately, moving from one dialect to another is much easier once you have the basics down.
Not much. Plenty of online resources can help you troubleshoot query problems or installation errors.
SQL can be tricky to learn without access to a relational database. Use the resources below to build a beginner's foundation:
- PGExercises (PostgreSQL): PGExercises provides a series of questions and explanations built on a single, simple dataset. It's designed for use as a partner to a good book or Postgres' excellent documentation.
- Tutorial Republic: This SQL tutorial series covers the fundamental concepts of the SQL language, such as creating databases and tables, using constraints, adding records to a table, selecting records from a table based on different conditions, and updating and deleting records in a table.
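If you would like to try those fundamentals before working through a tutorial, the sketch below runs each one (create a table with constraints, insert, select, update, delete) against an in-memory SQLite database through Python's `sqlite3` module. The `members` table and its values are hypothetical, and other dialects may spell some details differently.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
cur = conn.cursor()

# Create a table with constraints (PRIMARY KEY, NOT NULL, CHECK).
cur.execute("""
    CREATE TABLE members (
        member_id INTEGER PRIMARY KEY,
        name      TEXT NOT NULL,
        fee       REAL NOT NULL CHECK (fee >= 0)
    )
""")

# Add records; the ? placeholders keep values separate from the SQL text.
cur.executemany(
    "INSERT INTO members (member_id, name, fee) VALUES (?, ?, ?)",
    [(1, "Sam", 10.0), (2, "Lee", 0.0), (3, "Kim", 25.0)],
)

# Select records that match a condition.
paying = cur.execute(
    "SELECT name FROM members WHERE fee > 0 ORDER BY name"
).fetchall()

# Update one record, then delete another.
cur.execute("UPDATE members SET fee = 5.0 WHERE member_id = 2")
cur.execute("DELETE FROM members WHERE member_id = 3")
remaining = cur.execute(
    "SELECT member_id, name, fee FROM members ORDER BY member_id"
).fetchall()

# A constraint violation raises an error instead of storing bad data.
try:
    cur.execute(
        "INSERT INTO members (member_id, name, fee) VALUES (4, 'Pat', -1)"
    )
    constraint_enforced = False
except sqlite3.IntegrityError:
    constraint_enforced = True

print(paying, remaining, constraint_enforced)
conn.close()
```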
While a local SQL installation is not necessary to learn the basics, you may find a project that requires you to build a relational database. Follow the instructions below to install DB Browser for SQLite on your local computer to help you get started. The browser is compatible with Windows and macOS devices.
- Installing a DB Browser for SQLite: DB Browser for SQLite (DB4S) is a high-quality, visual, open-source tool to create, design, and edit database files compatible with SQLite.
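As a rough sketch of building a small database you could then open in a visual tool, the snippet below uses Python's `sqlite3` module to write a SQLite database file. The file name `practice.db` and the `facilities` table are assumptions chosen for illustration, not part of any project here.

```python
import sqlite3
from pathlib import Path

# Hypothetical file name; the file is created in the current directory.
db_path = Path("practice.db")

# Connecting to a path that does not exist yet creates the database file.
conn = sqlite3.connect(db_path)
conn.execute("""
    CREATE TABLE IF NOT EXISTS facilities (
        facility_id INTEGER PRIMARY KEY,
        name        TEXT NOT NULL
    )
""")
# INSERT OR REPLACE keeps the script safe to run more than once.
conn.execute("INSERT OR REPLACE INTO facilities VALUES (1, 'Tennis Court')")
conn.commit()
conn.close()

# Reopen the file and list its tables to confirm the data was saved.
conn = sqlite3.connect(db_path)
tables = [
    row[0]
    for row in conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table'"
    )
]
conn.close()
print(tables)
```

A file created this way is an ordinary SQLite database, so a tool that reads SQLite files should be able to open it.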
TBD
SQL coding interviews can sound intimidating, but they are common when hiring for data roles. Feel free to explore the resources below to nail that interview. :)
- Leetcode Qs: Depending on the role, you can expect a few easy to medium-level questions. More senior roles may require a few difficult questions.
- Data Science Jay: A YouTube channel dedicated to breaking down interview test questions with industry professionals.
- Alex the Analyst: A how-to on solving LeetCode questions during an interview. Rather than focusing only on what you're doing, this tutorial provides an overview of how to communicate your thought process.
TBD
(Some of these issues may be closed or open/in progress.)