Data types play a fundamental role in data science, as they determine the kind of data that can be stored, manipulated, and analyzed. Python provides several built-in data types that are commonly used in data science workflows. Understanding these data types is essential for effective data manipulation and analysis.
Numeric data types are used to represent numerical values. The common numeric data types in Python are integers (int), floating-point numbers (float), and complex numbers (complex).
Here's an example of using numeric data types:
age = 25 # Integer
height = 1.75 # Float
complex_num = 3 + 4j # Complex
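As a minimal sketch of how these numeric types behave in practice, the snippet below inspects each value with type() and combines them in arithmetic. The variable names total, half, floor_half, and magnitude are illustrative only and are not part of the original example.

```python
age = 25               # int
height = 1.75          # float
complex_num = 3 + 4j   # complex

# type() reports the class of each value
print(type(age).__name__)          # int
print(type(height).__name__)       # float
print(type(complex_num).__name__)  # complex

# Mixing an int with a float produces a float
total = age + height               # 26.75

# True division (/) always returns a float; floor division (//) keeps ints as ints
half = age / 2                     # 12.5
floor_half = age // 2              # 12

# abs() of a complex number is its magnitude
magnitude = abs(complex_num)       # 5.0
```

Keeping track of whether a value is an int or a float matters in analysis code, because division and rounding behave differently for each.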
Strings are used to represent textual data. They are enclosed in single quotes ('') or double quotes (""). Strings in Python are immutable: once created, a string cannot be modified in place, and any operation that appears to change it returns a new string instead.
Here's an example of using strings:
name = 'John Doe'
message = "Hello, world!"
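To make the immutability point concrete, here is a small sketch showing that item assignment on a string fails, while string methods return new strings and leave the original untouched. The names error_message, upper_name, first_name, and greeting are illustrative.

```python
name = 'John Doe'

# Item assignment is not allowed on a str, so this raises TypeError
try:
    name[0] = 'j'
except TypeError as exc:
    error_message = str(exc)

# String methods return new strings; the original is unchanged
upper_name = name.upper()          # 'JOHN DOE'
first_name = name.split()[0]       # 'John'
greeting = f"Hello, {first_name}!" # 'Hello, John!'

print(name)  # still 'John Doe'
```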
The Boolean data type (bool) represents one of two values: True or False. It is often used for logical operations and conditions in data science.
Here's an example of using boolean values:
is_valid = True
has_permission = False
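The following sketch shows how Boolean values combine with the logical operators and how comparisons can drive a simple filter. The ages list and the derived variable names are hypothetical sample data, not taken from the original text.

```python
is_valid = True
has_permission = False

# Logical operators combine Boolean values
can_proceed = is_valid and has_permission   # False
needs_review = is_valid or has_permission   # True
is_blocked = not has_permission             # True

# Comparisons produce Booleans, which makes them handy for filtering
ages = [17, 25, 42, 15]                     # hypothetical sample data
adults = [a for a in ages if a >= 18]       # [25, 42]

# bool is a subclass of int, so True counts as 1 when summed
adult_count = sum(a >= 18 for a in ages)    # 2
```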
Lists are used to store multiple items in a single variable. They are mutable and can contain elements of different data types. Lists are represented by square brackets ([]).
Here's an example of using lists:
numbers = [1, 2, 3, 4, 5]
names = ['John', 'Jane', 'Alice']
mixed_list = [1, 'apple', True]
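Because lists are mutable, they can be changed in place after creation. The sketch below illustrates a few common operations; first_three, squares, and type_names are illustrative names.

```python
numbers = [1, 2, 3, 4, 5]

# Lists can be modified in place
numbers.append(6)          # [1, 2, 3, 4, 5, 6]
numbers[0] = 10            # [10, 2, 3, 4, 5, 6]

# Slicing returns a new list
first_three = numbers[:3]  # [10, 2, 3]

# A list comprehension builds a new list from an existing one
squares = [n ** 2 for n in numbers]  # [100, 4, 9, 16, 25, 36]

# Elements of different types can live in the same list
mixed_list = [1, 'apple', True]
type_names = [type(item).__name__ for item in mixed_list]  # ['int', 'str', 'bool']
```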
Dictionaries are used to store key-value pairs. They are mutable and allow fast access to values using unique keys. Dictionaries are represented by curly braces ({}) and use a colon (:) to separate keys and values.
Here's an example of using dictionaries:
person = {
    'name': 'John Doe',
    'age': 30,
    'location': 'New York'
}
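As a brief sketch of working with a dictionary like this one, the snippet below reads, updates, adds, and removes entries. The 'email' key and its value are hypothetical additions for illustration.

```python
person = {
    'name': 'John Doe',
    'age': 30,
    'location': 'New York'
}

# Look up a value by its key
name = person['name']                    # 'John Doe'

# get() returns a default instead of raising KeyError for a missing key
email = person.get('email', 'unknown')   # 'unknown'

# Dictionaries are mutable: update, add, and delete entries in place
person['age'] = 31
person['email'] = 'john@example.com'     # hypothetical value
del person['location']

# Iterate over key-value pairs
for key, value in person.items():
    print(f"{key}: {value}")

# Keys keep their insertion order (Python 3.7+)
keys = list(person)                      # ['name', 'age', 'email']
```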
These are just a few of the data types commonly used in data science. Python also provides other built-in types, such as tuples and sets, along with more advanced data structures that offer flexibility and efficiency for various data manipulation tasks.