Essential Python questions asked in data engineering interviews at top companies, from core language concepts to data processing patterns.
Python is the lingua franca of data engineering. Interviewers test everything from core language mechanics to practical ETL scripting. These questions cover data structures, list comprehensions, generators, decorators, context managers, Pandas transformations, PySpark UDFs, error handling patterns, and real-world coding challenges you'll face in data pipeline development.
What are traits in Scala, and how are they different from classes?
Explain the difference between *args and **kwargs in Python.
Explain the difference between shallow copy and deep copy in Python.
What are decorators in Python, and how do they work?
What is the difference between a list and a tuple in Python?
Write a Python function to check if a string is a palindrome.
Write a Python function to find the first non-repeating character in a string.
Explain the difference between a list and a tuple in Python.
How do you handle exceptions in Python? Provide an example.
How do you handle memory management in Python?
What is the difference between a generator and a list in Python?
What is the difference between a set and a list in Python?
What is the difference between shallow copy and deep copy in Python?
Write a Python function to find the maximum value in a list without using the built-in max() function.
How is Amazon Deequ used, and what kinds of data quality checks can it perform?
Anagram Detection - find all anagrams from a given list of strings
Can you explain the concept of polymorphism and inheritance in Java with examples?
Can you give an example of processing nested JSON data using these functions?
Case Class and StructType Syntax
Check if a number is prime.
Closure Function - explain
Coin Change Problem - minimum number of coins required to make change
Collaborating with cross-functional teams to resolve data quality issues
Compare compression algorithms: Gzip vs Snappy.
Concatenating lists within a range using list comprehensions
Convert a Binary Search Tree (BST) into a skewed tree in either increasing or decreasing order
Convert a sorted array into a Binary Search Tree
Convert the list [1, [2, 3], 4, 5, 6, [7, 8, 9]] to a single list [1, 2, 3, 4, 5, 6, 7, 8, 9].
Count occurrences of elements in a list of tuples using Spark RDDs
Count of Alphabets in String
Create a dictionary with list elements as keys and their occurrences as values.
Create a function to detect anomalies in sales trends using Pandas and NumPy.
Create a Python program to demonstrate the use of set operations (union, intersection).
Create a script to parse and transform a JSON file into a structured CSV.
Describe script implementation and deployment.
Describe Spark's memory management model. How do you handle heap memory overhead issues?
Design a solution to generate unique device names from a list of IoT devices.
Design an algorithm to merge k sorted lists of video streaming data.
Detect a loop in a singly linked list
Develop a Python script to clean data by removing duplicates and handling missing values.
Difference between Stack vs Queue
Differences between Stack, Queue, and Linked List
Differentiate SORT BY, ORDER BY, DISTRIBUTE BY, and CLUSTER BY
Discuss the tech stacks and responsibilities at Morgan Stanley
Discuss your approach to unit testing in your code.
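
Three of the coding questions listed above (the palindrome check, the first non-repeating character, and flattening the nested list [1, [2, 3], 4, 5, 6, [7, 8, 9]]) could be approached roughly as follows. This is a minimal sketch rather than a model answer: the function names are hypothetical, and the palindrome check assumes that case and non-alphanumeric characters should be ignored, which the question itself does not specify.

```python
from collections import Counter


def is_palindrome(text):
    """Return True if text reads the same forwards and backwards.

    Assumption: case and non-alphanumeric characters are ignored.
    """
    cleaned = [ch.lower() for ch in text if ch.isalnum()]
    return cleaned == cleaned[::-1]


def first_non_repeating(text):
    """Return the first character that occurs exactly once, or None."""
    counts = Counter(text)  # one pass to count every character
    for ch in text:         # second pass preserves original order
        if counts[ch] == 1:
            return ch
    return None


def flatten(items):
    """Flatten arbitrarily nested lists into a single flat list."""
    flat = []
    for item in items:
        if isinstance(item, list):
            flat.extend(flatten(item))  # recurse into nested lists
        else:
            flat.append(item)
    return flat


if __name__ == "__main__":
    print(is_palindrome("A man, a plan, a canal: Panama"))
    print(first_non_repeating("swiss"))
    print(flatten([1, [2, 3], 4, 5, 6, [7, 8, 9]]))
```

Counting first and then scanning in order keeps the non-repeating lookup linear in the length of the string. The recursive flatten handles deeper nesting than the question requires, but would hit Python's recursion limit on extremely deep input.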