Unit 8, Section A Quiz

  1. Question 1:

    Which characteristic of large data sets refers to the speed at which data is generated and collected?

    1. Volume
    2. Variety
    3. Velocity
    4. Veracity
  2. Question 2:

    What is the purpose of data aggregation in the analysis of large data sets?

    1. To identify and remove duplicate entries
    2. To summarize data and identify overall trends and patterns
    3. To visualize data using charts and graphs
    4. To replace missing values with estimates
  3. Question 3:

    Which technique involves grouping similar data points together based on shared characteristics?

    1. Data filtering
    2. Cluster analysis
    3. Regression analysis
    4. Data aggregation
  4. Question 4:

    What is the main goal of data cleaning and preparation?

    1. To improve data quality and ensure it is ready for analysis
    2. To visualize data in graphical formats
    3. To calculate descriptive statistics such as mean and median
    4. To create machine learning models for prediction
  5. Question 5:

    Which of the following is a common issue encountered in raw data?

    1. High resolution
    2. Consistent formatting
    3. Low volume
    4. Missing values
  6. Question 6:

    What is the purpose of Principal Component Analysis (PCA)?

    1. To standardize data for analysis
    2. To classify observations into predefined categories
    3. To identify outliers and anomalies in data
    4. To reduce the dimensionality of data while preserving as much variance as possible
  7. Question 7:

    Which of the following is a key Python library used for data manipulation and analysis?

    1. ggplot2
    2. dplyr
    3. pandas
    4. caret
  8. Question 8:

    In multivariate data analysis, what is the purpose of factor analysis?

    1. To reduce the number of variables while retaining variance
    2. To classify data points into different clusters
    3. To create visualizations for data exploration
    4. To identify underlying factors that explain correlations among variables
  9. Question 9:

    Which of the following is a benefit of using R and Python for data analysis?

    1. Limited support for data visualization
    2. Closed-source development and proprietary libraries
    3. Versatility and wide range of libraries for various tasks
    4. Lack of active user communities and resources
  10. Question 10:

    What is the role of descriptive statistics in data analysis?

    1. To summarize data using measures of central tendency and variability
    2. To make predictions based on data
    3. To clean and prepare data for analysis
    4. To visualize complex data using interactive tools