Hello Manish!
I echo what Nawang from our teaching staff has said. Thank you for reaching out.
I believe that article was not behind a paywall when we added it. Nonetheless, here is a summary to make sure you don’t miss out:
Data Structures in Data Science: A Primer
Why are Data Structures Essential in Data Science?
Data scientists need efficient ways to organize, store, and access data. Choosing the right data structure can significantly influence the efficiency of a given task, whether that task is feeding data into a model, storing results, or building visualizations.
Built-in Python Data Structures:
- List: An ordered, indexable, and mutable structure that can hold duplicate items. Lists are versatile and commonly used because items are easy to access by index.
- Dictionary: A collection of key-value pairs that is mutable and indexable by key. Dictionaries are excellent for efficient lookups when the keys are known, but they cannot contain duplicate keys.
- Set: An unordered collection that cannot contain duplicates. Useful when only unique values are required or when finding overlaps between data sources.
- Tuple: An ordered and indexable structure, similar to a list but immutable. Tuples are useful when data must remain unchanged, such as when storing model outputs.
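To make the four built-in structures above a little more concrete, here is a small sketch of my own (it is not taken from the article). All variable names and values are made up purely for illustration.

```python
# Illustrative data only; every name and value here is hypothetical.

# List: ordered, mutable, allows duplicates, accessed by position.
scores = [0.91, 0.85, 0.91]
scores.append(0.78)            # lists can grow in place
first_score = scores[0]        # access by index

# Dictionary: key-value pairs, looked up by key.
model_params = {"learning_rate": 0.01, "epochs": 10}
model_params["epochs"] = 20    # assigning to an existing key overwrites its value

# Set: unique values only; handy for finding overlaps.
train_ids = {1, 2, 3, 4}
test_ids = {3, 4, 5}
overlap = train_ids & test_ids  # intersection of the two sets

# Tuple: ordered and indexable, but immutable.
result = ("model_v1", 0.91)
try:
    result[1] = 0.95           # tuples do not support item assignment
    tuple_is_immutable = False
except TypeError:
    tuple_is_immutable = True
```

Here `overlap` holds only the IDs that appear in both sets, and the attempt to change `result` raises a `TypeError`, which is exactly the protection you want when a value should not be altered after it is produced.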
Abstract Data Types:
- Queue: Mimics real-life queues with First-In-First-Out (FIFO) logic. Useful for tasks like scheduling CPU jobs or running breadth-first search algorithms.
- Stack: Operates on Last-In-First-Out (LIFO) logic. Useful when the most recent addition is the primary concern, as in code interpretation or undo operations.
- Linked List: A collection of nodes in linear order, either singly linked (each node points forward) or doubly linked (each node points both ways). Inserting or deleting at a known node only requires updating pointers, whereas an array-backed list may have to shift the elements that follow.
- Graph: Represents entities (vertices) and their relations (edges). Useful in diverse scenarios, from road networks to social media platforms.
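Python has no dedicated built-in type for most of these, so here is one possible way to sketch them with the standard library. This is my own minimal illustration, not the article's code: the `Node` class, the `to_list` and `bfs` helpers, and all the sample data are hypothetical names chosen for this example.

```python
from collections import deque

# Queue (FIFO): deque gives fast appends and pops at both ends.
queue = deque()
queue.append("job1")
queue.append("job2")
queue.append("job3")
first_out = queue.popleft()     # the earliest job leaves first

# Stack (LIFO): a plain list works, using append and pop at the end.
stack = []
stack.append("edit1")
stack.append("edit2")
last_in = stack.pop()           # the most recent edit leaves first


# Singly linked list: each node stores a value and a reference to the next node.
class Node:
    def __init__(self, value, next_node=None):
        self.value = value
        self.next = next_node


head = Node("a", Node("c"))
head.next = Node("b", head.next)  # insert "b" after "a" by rewiring one pointer


def to_list(node):
    """Walk the linked list from `node` and collect the values in order."""
    values = []
    while node is not None:
        values.append(node.value)
        node = node.next
    return values


# Graph: an adjacency mapping from each vertex to its neighbours.
graph = {"A": ["B", "C"], "B": ["D"], "C": ["D"], "D": []}


def bfs(graph, start):
    """Return vertices in breadth-first order, using a queue as the frontier."""
    order = [start]
    seen = {start}
    frontier = deque([start])
    while frontier:
        vertex = frontier.popleft()
        for neighbour in graph[vertex]:
            if neighbour not in seen:
                seen.add(neighbour)
                order.append(neighbour)
                frontier.append(neighbour)
    return order
```

Calling `to_list(head)` walks the three nodes in order, and `bfs(graph, "A")` visits vertices level by level. Notice that the breadth-first search relies on a queue internally, which is why the two are so often mentioned together.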
Conclusion:
Data scientists frequently interact with various data structures, each with its benefits and trade-offs. Knowing when to use which structure is pivotal for efficient data handling and processing. Moreover, familiarity with these structures can be a great asset in data science and software engineering interviews.
Hope this helps!
All the best,
Shibani