Data Science Objective Set 2

Question: Select the bias types which describe sampling bias.

Projection bias

Negativity bias

Self-selection bias

Exclusion bias

Survivorship bias

Ans:-

Self-selection bias

Exclusion bias

Survivorship bias

Question: Select the definition of the Coefficient of Variation.

Square of the standard deviation divided by the mode

Absolute value of the Z-score divided by the mean

Square root of the sum of deviations from the mean

Ratio of the standard deviation to the mean

Ans:- Ratio of the standard deviation to the mean
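
As a minimal sketch of the definition above, the coefficient of variation can be computed with the standard library. The helper name `coefficient_of_variation` and the sample data are illustrative assumptions, and the sample (n - 1) standard deviation is used here.

```python
import statistics

def coefficient_of_variation(values):
    """Return the sample standard deviation divided by the mean."""
    mean = statistics.mean(values)
    if mean == 0:
        raise ValueError("coefficient of variation is undefined for a zero mean")
    return statistics.stdev(values) / mean

# Hypothetical measurements, used only for illustration.
heights_cm = [150, 160, 170, 180, 190]
cv = coefficient_of_variation(heights_cm)

# Rescaling the data (for example cm to mm) leaves the ratio unchanged.
cv_mm = coefficient_of_variation([h * 10 for h in heights_cm])
```

Because it is a ratio, the coefficient of variation is unitless, which is why it is often used to compare spread across variables measured on different scales.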

Question: Select the answer that describes the bias in an estimator.

The estimator always tends to the mean

The difference between the true value and expected value

Only the variance can be biased

The estimator is always off by one

Ans:- The difference between the true value and expected value
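
A small worked example may help here. The sketch below assumes a toy population of four values and enumerates every equally likely sample of size 2 (drawn with replacement), so the expected value of each estimator is exact rather than simulated. Bias is taken as the expected value minus the true value; the variable names are illustrative.

```python
import itertools
import statistics

population = [1, 2, 3, 4]  # treated as the complete population
true_variance = statistics.pvariance(population)

# Every ordered sample of size 2 drawn with replacement is equally likely,
# so averaging an estimator over all of them gives its exact expected value.
samples = list(itertools.product(population, repeat=2))

# Estimator 1: divide by n (pvariance applied to the sample).
expected_divide_by_n = statistics.mean(statistics.pvariance(s) for s in samples)
# Estimator 2: divide by n - 1 (the usual sample variance).
expected_divide_by_n_minus_1 = statistics.mean(statistics.variance(s) for s in samples)

bias_divide_by_n = expected_divide_by_n - true_variance
bias_divide_by_n_minus_1 = expected_divide_by_n_minus_1 - true_variance
```

In this setup the divide-by-n estimator underestimates the variance on average, while the divide-by-(n - 1) estimator has zero bias.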

Question: Which distribution describes the plot of sample means from any random distribution?

Uniform

Binomial

Normal

Poisson

Ans:- Normal
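
This is the central limit theorem. It holds when the underlying distribution has finite variance and the sample size is reasonably large. The simulation below is a rough sketch under those assumptions: it draws from a skewed exponential distribution with mean 1 and checks that the sample means cluster around 1 with a spread close to 1/sqrt(n). The seed, sample size, and number of repetitions are arbitrary choices.

```python
import random
import statistics

rng = random.Random(0)  # fixed seed so the run is repeatable

def sample_mean(size):
    """Mean of `size` draws from an exponential distribution with mean 1."""
    return statistics.fmean(rng.expovariate(1.0) for _ in range(size))

sample_size = 50
means = [sample_mean(sample_size) for _ in range(2000)]

centre = statistics.fmean(means)   # should be close to the population mean, 1.0
spread = statistics.stdev(means)   # should be close to 1 / sqrt(sample_size)
```

Plotting a histogram of `means` would show an approximately bell-shaped curve even though the individual draws are strongly skewed.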

Question: A data science platform must provide an environment flexible enough to integrate a variety of tools and tool types including which key programming languages?

R

JavaScript

Python

HTML

Ans:-

R

Python

Question: When deploying data science tools, software engineering best practices should be adhered to. Which type of tool would you use for centralized management of code?

Version control

RDBMS

Visual Studio Code

DVD

Ans:- Version control

Question: Coverage is a key data science tool consideration. What does coverage refer to?

The types of projects covered

The platform’s core capabilities

Test code coverage

The platform’s ability to cover data

Ans:- The platform’s core capabilities

Question: In which step in the data science workflow would one typically perform feature engineering?

Define objective

Explore/clean data

Evaluate/tune model

Import data

Ans:- Explore/clean data

Question: When using data science tools to perform text exploration, what kind of data is typically being analysed?

Relational data

Unstructured data

CSV data

Structured data

Ans:- Unstructured data

Question: Which of these are considered common uses for data science visualization tools?

Discover new features

Generate models

Restructure data

Explore information

Ans:-

Discover new features

Explore information

Question: When using a data science database tool to acquire streaming data from a device, when is the data processing typically performed?

On the device

After import

Before import

In real-time

Ans:- In real-time

Question: Which of these are valid benefits of deploying cloud-based tools?

Reliability

Availability

Scalability

Security

Compliance

Ans:-

Reliability

Availability

Scalability

Question: Which of these are valid challenges of deploying cloud-based tools?

Scalability

Security

Network latency

Regulatory compliance

Performance

Ans:-

Security

Network latency

Regulatory compliance

Question: Which of these are typical functionalities of DevOps?

Deployment

Evaluating

Testing

Integration

Cleaning

Ans:-

Deployment

Testing

Integration

Question: Working within DevOps for data science, which types of resources would be subject to automated testing?

IoT devices

Model performance

Data quality

Containers

Ans:-

Model performance

Data quality

Question: Match the following statements related to Seaborn with their correct boolean values.

Answer Options:
A:Seaborn is a data visualization library built on top of Matplotlib
B:Seaborn is part of the PyData stack which is the open data science stack available in Python
C:Seaborn allows the user to very finely control every detail of the plot and lets the user perform complex tasks with it

True

A

B

C
Ans:- A,B

False

A

B

C
Ans:- C

Question: Let’s say you have created a Dataframe called “data” and you call the describe() function on this dataframe. What does this describe() function return?

It returns the datatype of each of the columns of the dataframe

It returns the details like number of rows in the dataframe, the number of columns in the dataframe, total number of cells in the dataframe etc.

It returns a summary of all of the string columns in the dataframe

It returns the summary of all of the numeric columns of the Dataframe by default. The summary includes the count, the mean, the standard deviation, the minimum, the maximum values of the columns etc.

Ans:- It returns the summary of all of the numeric columns of the Dataframe by default. The summary includes the count, the mean, the standard deviation, the minimum, the maximum values of the columns etc.
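
A short sketch of this behaviour, assuming pandas is installed. The column names and values are made up for illustration.

```python
import pandas as pd

# Hypothetical data: two numeric columns and one string column.
data = pd.DataFrame(
    {
        "price": [10.0, 12.5, 11.0, 14.0],
        "units": [3, 5, 2, 8],
        "city": ["Oslo", "Lima", "Pune", "Oslo"],
    }
)

# By default only the numeric columns are summarised.
summary = data.describe()
summary_columns = list(summary.columns)
summary_rows = list(summary.index)
```

Passing `include="all"` or `include="object"` to `describe()` is the documented way to summarise the string column as well.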

Question: Match the following statements related to the distplot function in Seaborn with the correct values.

Answer Options:
A:Distplot is used for visualizing the distribution of a single column of data in a Dataframe
B:By default, the Seaborn distribution plot automatically plots a smooth representation of the distribution of data across the range of values passed into the function (a KDE curve)
C:The orientation of the distribution of the data passed into this function cannot be changed and is fixed. The range values will always be along the x-axis.

True

A

B

C
Ans:- A,B

False

A

B

C
Ans:- C

Question: Which of these plots is NOT rendered by the Seaborn distplot function?
Instruction: Choose the option that best answers the question.

Histogram

Kernel Density Estimation curve

2-D Scatter Plot

Rug plot

Ans:- 2-D Scatter Plot

Question: What does the bandwidth of a KDE curve determine?

The bandwidth of the curve determines what portion of the entire range of values will be considered to plot the KDE estimate at any point

The bandwidth of a KDE curve determines the smoothness of the curve. The lower the bandwidth, the smoother the curve

The bandwidth of a KDE curve determines the length of the curve

The bandwidth of a KDE curve determines the smoothness of the curve. The higher the bandwidth, the smoother the curve

Ans:-

The bandwidth of the curve determines what portion of the entire range of values will be considered to plot the KDE estimate at any point

The bandwidth of a KDE curve determines the smoothness of the curve. The higher the bandwidth, the smoother the curve
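
To make the effect of bandwidth concrete, here is a hand-rolled Gaussian KDE using only the standard library. It is an illustrative sketch, not Seaborn's implementation; the data, grid, and the "total variation" wiggliness measure are assumptions chosen for the demonstration.

```python
import math

def gaussian_kde(data, bandwidth):
    """Return a density function built from Gaussian kernels of the given width."""
    norm = 1.0 / (len(data) * bandwidth * math.sqrt(2 * math.pi))

    def density(x):
        # Each data point contributes a bump; the bandwidth sets how far it reaches.
        return norm * sum(math.exp(-0.5 * ((x - xi) / bandwidth) ** 2) for xi in data)

    return density

def total_variation(density, grid):
    """Sum of absolute changes along the grid: larger means a wigglier curve."""
    values = [density(x) for x in grid]
    return sum(abs(b - a) for a, b in zip(values, values[1:]))

data = [1.0, 2.0, 2.5, 6.0, 7.0]
grid = [i * 0.05 for i in range(-100, 301)]  # from -5.0 to 15.0

narrow = total_variation(gaussian_kde(data, 0.3), grid)
wide = total_variation(gaussian_kde(data, 2.0), grid)

# Riemann-sum check that the wide-bandwidth curve is still a density.
area_wide = sum(gaussian_kde(data, 2.0)(x) for x in grid) * 0.05
```

With the larger bandwidth each point influences a wider portion of the range, so the individual bumps merge into one smoother curve and the wiggliness measure drops.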

Question: Which of these Seaborn functions here can be used to plot the distribution of bi-variate data?

rugplot

kdeplot

jointplot

distplot

Ans:-

kdeplot

jointplot

Question: What is the output after you have passed 4 variables as input data to the pairplot function?

It outputs a 4×4 grid with a univariate distribution for the corresponding input

It outputs a 2-D scatter plot for any 4 pairs of variables against each other

It outputs a 4×4 grid with a 2-D scatter plot for every pair of variables against each other along with the univariate distribution for the corresponding input variables

It outputs a 2-D scatter plot for the first variable against the other 3 variables passed to it

Ans:- It outputs a 4×4 grid with a 2-D scatter plot for every pair of variables against each other along with the univariate distribution for the corresponding input variables

Question: What is the ‘hue’ argument in the Seaborn pairplot function used for?

It is used for specifying the gaps between all of the subplots that are rendered by the pairplot function

It is used for specifying what kind of plot we are going to use for all of our subplots

It is used to specify the colors for the markers in our scatter plot

It is used for specifying the x-variables and the y-variables that we are going to use in order to construct our plots

Ans:- It is used to specify the colors for the markers in our scatter plot

Question: Seaborn has a built-in function called set_context, which sets plot details such as labels, lines, grids, and other plot elements so that the plot suits the context in which you want to present it. What are the different context modes in Seaborn?

talk

paper

meetup

poster

Ans:-

talk

paper

poster

Question: Among the following list of people/groups of people, whom would you consider as outliers in their field?

The Beatles – often considered the most influential rock band of all time

Patrick Klandt – a professional soccer player who plays for a mid-level team

Serena Williams – the record holder for number of major wins in tennis

Sezz Medi – an Italian restaurant which operated for 4 years before shutting down

Ans:-

The Beatles – often considered the most influential rock band of all time

Serena Williams – the record holder for number of major wins in tennis

Question: Some of the features of boxplots are stated below. Match them with their correct boolean values

Answer Options:
A:The vertical lines of a box plot represent the range of distribution of data
B:The boxes represent the inter-quartile distribution i.e. the data from the 25th percentile right up to the 75th percentile
C:The max value of the data is represented by the horizontal line within the box
D:The outliers of the data are also captured by the whiskers of the box plot

False

A

B

C

D
Ans:- C,D

True

A

B

C

D
Ans:- A,B
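
The numbers behind a box plot can be sketched with the standard library. The example assumes the common 1.5 × IQR whisker rule and the "inclusive" quantile method; other tools may compute quartiles slightly differently, and the data is made up.

```python
import statistics

values = [1, 2, 3, 4, 5, 6, 7, 8, 9, 30]

# Quartiles: the box spans q1..q3 and the line inside the box is the median.
q1, median, q3 = statistics.quantiles(values, n=4, method="inclusive")
iqr = q3 - q1

# Whiskers reach the most extreme points that lie within 1.5 * IQR of the box.
lower_fence = q1 - 1.5 * iqr
upper_fence = q3 + 1.5 * iqr
inside = [v for v in values if lower_fence <= v <= upper_fence]
whisker_low, whisker_high = min(inside), max(inside)

# Points beyond the whiskers are drawn individually as outliers.
outliers = [v for v in values if v < lower_fence or v > upper_fence]
```

Here the value 30 falls outside the upper whisker, which is why outliers are plotted as separate points rather than being captured by the whiskers.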

Question: Match the following types of Seaborn plots with their correct description.
Instruction: Match each answer with the correct target. Each answer can only be used once.
Answer Options:
A:stripplot
B:countplot
C:pointplot
D:catplot

Similar to a histogram, with a bar for every categorical variable

A

B

C

D
Ans:- B

It renders a line connecting a number of points where each point represents the mean value for every category

A

B

C

D
Ans:- C

Similar to a scatter plot except in this case, one of our variables is categorical in nature

A

B

C

D
Ans:- A

Question: Some of the features of the Seaborn FacetGrid function are given below. Which of these are true?

We don’t have to explicitly mention the data that has to be represented in the graph or the type of graph that we want rendered. FacetGrid detects this automatically

FacetGrid cannot be used to perform analysis on bi-variate data

It allows the user to plot a different graph for each category present in a range of data

Ans:- It allows the user to plot a different graph for each category present in a range of data

Question: Let’s say you are generating multiple plots using the Seaborn FacetGrid but you want only three graphs to be displayed in a single row. How do you do this?

By setting the ‘col’ argument of the FacetGrid function to 3

By setting the ‘col_wrap’ argument of the FacetGrid function to 3

By setting the ‘row_wrap’ argument of the FacetGrid function to 3

By setting the ‘row’ argument of the FacetGrid function to 3

Ans:- By setting the ‘col_wrap’ argument of the FacetGrid function to 3

Question: When using colour palettes to colour your plots, how does Seaborn work when there are multiple objects to be coloured?

Seaborn cycles through each of the colours in the colour palette in a particular order and sets a colour to a given variable depending on the order in which the variables appear

It sets a colour to a given variable depending on the name of the variable

Different variables are given different colours in random order

Ans:- Seaborn cycles through each of the colours in the colour palette in a particular order and sets a colour to a given variable depending on the order in which the variables appear

Question: Some of the statements related to colour palettes in Seaborn are given below. Match them with their correct boolean values.
Instruction: Match each option with its correct target. Each category has a single match.

Answer Options:
A:True
B:False

Adjacent colors in qualitative palettes vary a lot

A

B
Ans:- A

Sequential palettes allow values close to each other to have similar shade and values far from each other to have completely different shades

A

B
Ans:- A

Sequential palettes are better suited for categorical data

A

B
Ans:- B

Question: Match the following statements related to the lmplot function in Seaborn with the correct values.
Instruction: Match each option with its correct target. Each category may have more than one match.
Answer Options:
A:lmplot will render a regression plot on a facet grid
B:lmplot is capable of rendering only a single regression plot between two variables
C:We can make our markers more evenly distributed in the lmplot by adding in jitters, but doing so will affect the regression line

False

A

B

C
Ans:- B,C

True

A

B

C
Ans:- A

Question: What is the Seaborn despine() function used for?

It can be used to set a gap between the axes of our visualization and the plot itself

The despine function is used to change the style and aesthetics of the grid lines of our plot

It is used to remove parts or all of the box inside which our visualizations are rendered

By default, our visualizations will not be plotted inside a box and the despine function is used to render our visualizations inside a box

Ans:-

It can be used to set a gap between the axes of our visualization and the plot itself

It is used to remove parts or all of the box inside which our visualizations are rendered

Question: Let’s say you have a grouped dataframe called “gdf” which has a column of integers called “c1”.
What would gdf.c1.sum() return?

The total number of groups

For each group, we get the sum of their values in “c1”

The number of records in each group

The total number of records in the grouped dataframe

Ans:- For each group, we get the sum of their values in “c1”
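
A minimal sketch of this, assuming pandas is installed. The grouping column `team` and the values are hypothetical.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "team": ["a", "b", "a", "b", "a"],  # hypothetical grouping key
        "c1": [1, 2, 3, 4, 5],
    }
)

gdf = df.groupby("team")

# One sum per group, indexed by the group key.
totals = gdf.c1.sum()
```

The result is a Series with one entry per group, not a single overall total.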

Question: What does the “values” member variable of a pandas dataframe return?

The contents of the dataframe in the form of a python list

The contents of the dataframe in the form of a pandas series

The contents of the dataframe in the form of a numpy array

The dataframe in a dictionary form

Ans:- The contents of the dataframe in the form of a numpy array
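
A quick check of this, assuming pandas and NumPy are installed; the data is illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({"a": [1, 2], "b": [3, 4]})

# .values exposes the underlying data as a NumPy array (row and column labels are dropped).
arr = df.values

# The pandas documentation recommends to_numpy() for the same purpose.
arr_alt = df.to_numpy()
```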

Question: Let’s say you have a multi-indexed dataframe called “multi_index” with three columns “c1”, “c2” and “c3” where the third column “c3” contains integer values.

You want to first group this dataset by the values in “c2” and within each value in “c2”, you want to group by “c1”, in order to calculate the sum of values in “c3” for each of these subgroups.

Which of these functions can help do so?

multi_index.groupby(level=['c2']).sum()

multi_index.groupby(level=['c1', 'c2']).sum()

multi_index.groupby(level=['c1']).sum()

multi_index.groupby(level=['c2', 'c1']).sum()

Ans:- multi_index.groupby(level=['c2', 'c1']).sum()
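
The following sketch shows the effect of the level order. It assumes that “c1” and “c2” are the index levels of the multi-indexed dataframe and that “c3” is the only data column; the values are made up.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "c1": ["x", "x", "y", "y", "x"],
        "c2": ["p", "q", "p", "p", "p"],
        "c3": [1, 2, 3, 4, 5],
    }
)
multi_index = df.set_index(["c1", "c2"])

# Group by c2 first, then by c1 within each c2 value, and sum c3.
result = multi_index.groupby(level=["c2", "c1"]).sum()
level_order = list(result.index.names)
```

The order of names in `level` determines the nesting of the resulting index, so `['c2', 'c1']` puts “c2” on the outside.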

Question: Match the following quantities, with the correct function that you need to pass to the aggregate function, called on a multi-indexed grouped dataframe, to return it.

Answer Options:
A:sum of all the quantities in the group
B:mean value of all the quantities in the group
C:minimum value of all the quantities in the group
D:maximum value of all the quantities in the group

np.max

A

B

C

D
Ans:- D

np.sum

A

B

C

D
Ans:- A

np.min

A

B

C

D
Ans:- C

np.mean

A

B

C

D
Ans:- B
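
As a sketch of several aggregations at once: the example below uses the string names `"sum"`, `"mean"`, `"min"`, and `"max"` instead of the NumPy functions named in the question. That substitution is an assumption made here because recent pandas releases may emit a warning when NumPy callables are passed to `agg`; the results are the same. The data is illustrative.

```python
import pandas as pd

df = pd.DataFrame(
    {
        "c1": ["x", "x", "y", "y"],
        "c2": ["p", "p", "p", "q"],
        "c3": [1, 5, 3, 4],
    }
).set_index(["c1", "c2"])

# One row per c1 group, one column per aggregation.
stats = df.groupby(level="c1")["c3"].agg(["sum", "mean", "min", "max"])
```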

Question: What is returned when you call the “isin” function, with one of the values in the column specified as a filter, on a dataframe?

Pandas series of string values where the value is false if that index had the same value as the filter and true otherwise

Pandas series with the index values where the values in the column do not match with the filter

Pandas series with the numeric values where the values in the column match with the filter

Pandas series of boolean values where the value is true if that index had the same value as the filter and false otherwise

Ans:- Pandas series of boolean values where the value is true if that index had the same value as the filter and false otherwise
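
A small sketch, assuming pandas is installed. Note that `isin` expects a list-like of values, so a single filter value is wrapped in a list; the column and values are hypothetical.

```python
import pandas as pd

df = pd.DataFrame({"fruit": ["apple", "pear", "apple", "kiwi"]})

# Boolean Series: True where the value is one of the listed filter values.
matches = df["fruit"].isin(["apple"])

# The boolean Series can then be used to select the matching rows.
filtered = df[matches]
```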

Question: What will be the values present in “new_series” by the end of this program?

import pandas as pd
pandas_series = pd.Series([1, 2, 3, 4])
new_series = pandas_series.mask(pandas_series > 1)

Result A.
0 1.0
1 NaN
2 NaN
3 NaN

Result B.
0 2.0
1 3.0
2 4.0

Result C.
0 NaN
1 2.0
2 3.0
3 4.0

Result D.
0 1.0

Result A

Result B

Result C

Result D

Ans:- Result A
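
The program from the question can be checked directly. `mask` replaces values where the condition is True with NaN; its counterpart `where` keeps values where the condition is True. The `where` comparison is an addition for contrast.

```python
import pandas as pd

pandas_series = pd.Series([1, 2, 3, 4])

# Values greater than 1 are replaced with NaN; the dtype becomes float.
new_series = pandas_series.mask(pandas_series > 1)

# where() is the complement: it keeps the values that satisfy the condition.
kept_series = pandas_series.where(pandas_series > 1)
```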

Question: Let’s say you have a dataframe called “df” which has a column called “quantity”. What would be the output when you execute the following line of code?

df.duplicated('quantity')

Values in the “quantity” column of df:
0
4
1
1
3
0
3

Result A.

0 True
1 False
2 True
3 True
4 True
5 True
6 True

Result B.

0 False
1 True
2 False
3 False
4 False
5 False
6 False

Result C.

0 False
1 False
2 False
3 True
4 False
5 True
6 True

Result D.

0 True
1 True
2 True
3 False
4 True
5 False
6 False

Result B

Result D

Result A

Result C

Ans:- Result C
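
The question's data can be reproduced as follows, assuming pandas is installed. By default `duplicated` marks every occurrence after the first as True.

```python
import pandas as pd

df = pd.DataFrame({"quantity": [0, 4, 1, 1, 3, 0, 3]})

# keep="first" is the default: the first occurrence of each value is not flagged.
flags = df.duplicated("quantity")

# drop_duplicates applies the same rule and removes the flagged rows.
unique_rows = df.drop_duplicates("quantity")
```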

Question: When creating a dataframe using the DataFrame constructor, what is the default dtype assigned to a column consisting of string values?

object

category

integer

string

Ans:- object

Question: Let’s say you’re trying to apply an inequality filter to a column in a dataframe.

For what column dtypes would the program throw an error when trying to perform this operation?

object

integer

unordered categorical

ordered categorical

Ans:- unordered categorical
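
A brief sketch of the difference, assuming pandas is installed. The category labels and their ordering are hypothetical.

```python
import pandas as pd

sizes = pd.Series(["small", "large", "medium"])

# Unordered categorical: only equality comparisons are defined.
unordered = sizes.astype("category")
try:
    unordered > "medium"
    raised = False
except TypeError:
    raised = True

# Ordered categorical: the declared category order makes inequalities meaningful.
size_type = pd.CategoricalDtype(["small", "medium", "large"], ordered=True)
ordered = sizes.astype(size_type)
bigger_than_small = ordered > "small"
```

With the ordered dtype the comparison follows the declared order (small < medium < large) rather than alphabetical order.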

Question: What does the ffill() function do when you call it on your dataframe?

NaN values in the beginning of all the columns in the dataframe are filled with the first observed value of that column

Removes the records with NaN values

NaN values in each column of the dataframe are filled with the last valid value that precedes them in that column

Replaces the NaN value with -1

Ans:- NaN values in each column of the dataframe are filled with the last valid value that precedes them in that column
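
For reference, `ffill` (forward fill) propagates the last valid observation forward; it does not interpolate, and a NaN at the very start of a column stays NaN because there is nothing before it to copy. A minimal sketch, assuming pandas is installed and using made-up values:

```python
import pandas as pd

df = pd.DataFrame({"reading": [None, 1.0, None, None, 4.0, None]})

# Each NaN takes the most recent non-NaN value above it in the same column.
filled = df.ffill()
filled_values = filled["reading"].tolist()
```

Its counterpart `bfill` works in the opposite direction, copying the next valid value backwards.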

Question: Besides accuracy, what else does veracity refer to?

Meaningless

Truthfulness

Noisiness

Cleanliness

Ans:- Truthfulness

Question: Which is not an example of a use case of Big Data?

Telecommunication companies reducing IT costs

Epidemic prediction

Updating menus with high-priced items

Airlines collecting engine information

Ans:- Updating menus with high-priced items

Question: Which is not one of the four V’s of Big Data?

Veracity

Velocity

Volume

Value

Ans:- Value

Question: What’s an example of something you can control?

The accuracy of data

The speed of user data

Better decision making

The amount of data sources

Ans:- Better decision making

Question: Which interactivity feature is available by default in an Altair bar chart?

Save as SVG

Save as PNG

Zoom in

Pan

Zoom out

Ans:- Save as SVG

Question: Which statement best defines a wide form dataset?

A wide form dataset has one row per independent variable, with metadata recorded in the row and column labels

A wide form data set is a data set with more than ten columns and less than a hundred rows

A wide form data set is a data set with more columns than rows in the entire data set

A wide form dataset has one row per observation, with metadata recorded within the table as values

Ans:- A wide form dataset has one row per independent variable, with metadata recorded in the row and column labels

Question: Which error will be thrown if you attempt to visualize a dataset with more than 5000 rows by default?

ExceedsLimitationError

IntegrityError

No error will be thrown at all, the code will work fine

MaxRowsError

Ans:- MaxRowsError

Question: Which commands can be used to install Altair, Vega, and Vega Lite from a Jupyter notebook?

!install altair, vega, vega_datasets

!pip install altair vega vega_datasets

python install altair, vega, vega_datasets

install altair, vega, vega_datasets

Ans:- !pip install altair vega vega_datasets

Question: How would you parameterize a call to the alt.X() constructor to specify the number of bins in a histogram?

Using the “binning” input argument to the alt.X() function

Using the “bins” input argument to the alt.X() function

Using the “bin” input argument to the alt.X() function

Using the “num_bins” input argument to the alt.X() function

Ans:- Using the “bin” input argument to the alt.X() function

Question: Which statements can be used to create a brush which selects a range on the X axis in a chart?

alt.selection_interval(encodings = ["x"])

alt.brush(axis = x)

alt.brush()

alt.selection_interval(axis = x)

Ans:- alt.selection_interval(encodings = ["x"])

Question: How would you parameterize a call to the alt.Chart().mark_boxplot().encode() method to specify the color in a box plot?

Using the “palette” input argument to the alt.Chart().mark_boxplot().encode() method

Using the “col” input argument to the alt.Chart().mark_boxplot().encode() method

Using the “hue” input argument to the alt.Chart().mark_boxplot().encode() method

Using the “color” input argument to the alt.Chart().mark_boxplot().encode() method

Ans:- Using the “color” input argument to the alt.Chart().mark_boxplot().encode() method

Question: Which are valid inputs to the “sort” parameter to the alt.X() constructor?

“+”

“asce”

“desc”

“y”

“-y”

Ans:-

“y”

“-y”

Question: You would like to create a line chart with step interpolation. How would you parameterize your call to the alt.Chart().mark_line() method?

alt.Chart().mark_line(interpolation = ‘step’,…)

alt.Chart().mark_line(interpolation_mode = ‘step’,…)

alt.Chart().mark_line(interpolate = “step”,…)

alt.Chart().mark_line(step = True,…)

Ans:- alt.Chart().mark_line(interpolate = “step”,…)

Question: You would like to create a brush which selects a range of data points in a scatter plot. How would you parameterize your call to the alt.selection() function?

alt.selection (range = “interval”,…)

alt.selection(type = “interval”,…)

alt.selection (apply = “interval”,…)

alt.selection(select = “interval”,…)

Ans:- alt.selection(type = “interval”,…)

Question: Which function from the alt.Chart() class can be used to create a violin plot?

alt.Chart().mark_kernel()

alt.Chart().mark_area()

alt.Chart().mark_violin()

alt.Chart().mark_KDE()

Ans:- alt.Chart().mark_area()

Question: Which function from the alt.Chart() class can be used to create a scatter plot with hollow points?

alt.Chart().mark_circle()

alt.Chart().mark_point()

alt.Chart().mark_hollow_scatter()

alt.Chart().mark_scatter()

Ans:- alt.Chart().mark_point()

Question: Which Altair class can be used to add conditional formatting to a chart?

alt.condition()

alt.Condition()

alt.conditional_formatting()

alt.ConditionalFormatting()

Ans:- alt.condition()

Question: You would like to add a dark-green line to an area chart. How would you parameterize your call to the alt.Chart().mark_area() method?

alt.Chart().mark_area(line_color = “darkgreen”,…)

alt.Chart().mark_area(line_layout = {“color” : “darkgreen”},…)

alt.Chart().mark_area(outline = {“color” : “darkgreen”},…)

alt.Chart().mark_area(line = {“color” : “darkgreen”},…)

Ans:- alt.Chart().mark_area(line = {“color” : “darkgreen”},…)

Question: Which statement accurately describes a trellis area chart?

A trellis area chart is an area chart which uses a gradient to color the area

A trellis area chart is an area chart with an outline for each of the areas being visualized

A trellis area chart is an area chart with multiple categories being visualized

A trellis area chart is an area chart with a separate chart for every category

Ans:- A trellis area chart is an area chart with a separate chart for every category

Question: What kind of variables can be placed on the X and Y axis of a scatter plot?

The X axis has to be continuous and the Y axis can be either categorical or continuous

Both of the axes in a scatter plot can be either categorical or continuous

The X axis has to be continuous and the Y axis has to be continuous

The X axis has to be categorical and the Y axis has to be categorical

Ans:- The X axis has to be continuous and the Y axis has to be continuous

Question: Assume you have two line charts stored in variables called line_01 and line_02.
Which statement can be used to return both of these lines in the same chart?

alt.combine(line_01, line_02)

line_01 | line_02

line_01 & line_02

alt.layer(line_01, line_02)

Ans:- alt.layer(line_01, line_02)

Question: Why does volume matter?

By 2020 it’s expected that we’ll have 55 times the data we had in 2010

We don’t have enough sensors

Data storage is plentiful

80 percent of data was created in the past two years

Ans:- 80 percent of data was created in the past two years

Question: What causes the variety problem?

Reduction in complexity

No more opportunity

More structured data

Increasing ways data is received

Ans:- Increasing ways data is received

Question: Which is not a principle of variety?

Data can be passive

Data is always structured

Variety is expensive

Variety means the same kind of data

Ans:- Data is always structured

Question: You would like to create a green colored map. How would you parameterize your call to the alt.Chart().mark_geoshape() method?

alt.Chart().mark_geoshape(fill = “green”,…)

alt.Chart().mark_geoshape(color = “green”,…)

alt.Chart().mark_geoshape(hue = “green”,…)

alt.Chart().mark_geoshape(format = {“color” : “green”},…)

Ans:- alt.Chart().mark_geoshape(fill = “green”,…)

Question: How would you parameterize your call to the alt.Chart().transform_aggregate() method to create a plot with one marker for each category of a column?

Using the “groupby” input argument to the alt.Chart().transform_aggregate() method

Using the “group” input argument to the alt.Chart().transform_aggregate() method

Using the “group_by” input argument to the alt.Chart().transform_aggregate() method

Using the “aggregate” input argument to the alt.Chart().transform_aggregate() method

Ans:- Using the “groupby” input argument to the alt.Chart().transform_aggregate() method

Question: Which function from the alt.Chart() class can be used to create a heat map?

alt.Chart().mark_heatmap()

alt.Chart().mark_map()

alt.Chart().mark_heat()

alt.Chart().mark_rect()

Ans:- alt.Chart().mark_rect()

Question: Which of the following best defines a ranged dot plot?

A dot plot with two or more dots representing a range of values which are connected by a line

A dot plot with dots representing a continuous range of values

A dot plot with two dots representing a range of values

A dot plot with all the dots in the plot connected by a line

Ans:- A dot plot with two or more dots representing a range of values which are connected by a line

Question: You would like to create a y axis on the right side of a chart. How would you parameterize your call to the alt.Axis() constructor?

alt.Axis(orientation = “right”,…)

alt.Axis(axis = “right”,…)

alt.Axis(position = “right”,…)

alt.Axis(orient = “right”,…)

Ans:- alt.Axis(orient = “right”,…)

Question: Which statement can be used to create a brush which selects a single point in a scatter chart?

alt.selection_single()

alt.selection_interval()

alt.brush_single()

alt.brush()

Ans:- alt.selection_single()

Question: Which classes are required to create a candlestick chart in Altair?

alt.OHLC

alt.Y2

alt.Y

alt.Chart

alt.Candle

Ans:-

alt.Y2

alt.Y

alt.Chart

Question: You want to create a color scale in a variable “color_scale” with colors for the categories “X”, “Y”, “Z” for use in a chart.
Which statement can be used to achieve this?

color_scale = alt.ColorScale(categories = [“X”, “Y”, “Z”])

color_scale = alt.ColorScale(domain = (“X”, “Y”, “Z”))

color_scale = alt.Scale(categories = [“X”, “Y”, “Z”])

color_scale = alt.Scale(domain = [“X”, “Y”, “Z”])

Ans:- color_scale = alt.Scale(domain = [“X”, “Y”, “Z”])

Question: How would you parameterize your call to the alt.Chart().mark_bar() method so that the width of the bars in a bar chart also correspond to a variable?

Using the “x_width” input argument to the alt.Chart().mark_bar() function

Using the “x2” input argument to the alt.Chart().mark_bar() function

Using the “width” input argument to the alt.Chart().mark_bar() function

Using the “bar_width” input argument to the alt.Chart().mark_bar() function

Ans:- Using the “x_width” input argument to the alt.Chart().mark_bar() function

Question: Which statements about the default representation of a strip plot in Altair are true?

The X and Y axes of a strip plot are both bucketed

One axis of a strip plot is always continuous

One axis of a strip plot is always categorical

A strip plot visualizes univariate data using bars

Ans:-

One axis of a strip plot is always continuous

One axis of a strip plot is always categorical

Question: Which class can be used to perform a sort operation on a variable in a dash chart?

alt.Sort

alt.SortX

alt.SortField

alt.SortY

Ans:- alt.SortField

Question: Which of these statements best defines clustered bar charts?

A bar chart where the bars are stacked upon one another

A bar chart where the bars slope downwards

A bar chart without lines separating each bar

A bar chart which visualizes multiple variables

Ans:- A bar chart which visualizes multiple variables

Question: Which functions can be used to create a second axis which shares the same x-axis as the first?

ax.commonx()

ax.sharex()

ax.twinx()

ax.X()

Ans:- ax.twinx()

Question: How can multiple sheets be created when exporting data frames to Excel in R using writexl?

By chaining write_xlsx function calls

By specifying a list of data frames

Multiple sheets cannot be written using writexl

By first opening the excel file and appending the new sheet

Ans:- By specifying a list of data frames

Question: By default, what is read by the html_table function from the rvest library?

The last table in the HTML document

A vector containing the tables in the HTML document

A list containing the tables in the HTML document

The first table in the HTML document

Ans:- A list containing the tables in the HTML document

Question: Select the description of what action the following code performs.
sink(“file.txt”)

file.txt will be printed to the console

Console output will be redirected to file.txt

file.txt will be read into an R data frame

file.txt will be deleted

Ans:- Console output will be redirected to file.txt

Question: Given the result of the read_excel function from the readxl package, what function is then used to convert the resulting object into an R data frame?

as.data.frame

to_dataframe

tibble_to_dataframe

as_dataframe

Ans:- as.data.frame

Question: How is data represented in Lollipop charts?

Using data points and thin vertical bars

Using thin vertical bars with whiskers

Using thin vertical bars

Using data points and curves

Ans:- Using data points and thin vertical bars

Question: You would like to create a histogram without displaying the lines which separate the individual bars. How would you parameterize your call to the plt.hist() function?

plt.hist(type = ‘joint’,…)

plt.hist(hist = ‘step’,…)

plt.hist(kind = ‘step’,…)

plt.hist(histtype = ‘step’,…)

Ans:- plt.hist(histtype = ‘step’,…)

Question: If you want to create a histogram which visualizes probability values of records, how would you parameterize your call to the plt.hist() function?

Using the “kernel” input argument to the plt.hist() function

Using the “kde” input argument to the plt.hist() function

Using the “density” input argument to the plt.hist() function

Using the “distribution” input argument to the plt.hist() function

Ans:- Using the “density” input argument to the plt.hist() function

Question: Match the CSV read method with its type of CSV.
Instruction: Match each answer with the correct target. Each answer can only be used once.

Answer Options:
A:read.csv
B:read.csv2
C:read.delim

comma separator

A

B

C
Ans:- A

semi-colon separator

A

B

C
Ans:- B

tab separator

A

B

C
Ans:- C

Question: What is the default separator and decimal character used by write.csv?

semi-colon and comma

tab and period

comma and period

tab and comma

Ans:- comma and period

Question: Which functions can be performed on a figure object to create an axes object in that figure?

fig.axes()

fig.add_axes()

fig.create_axes()

fig.add_axis()

Ans:- fig.add_axes()

Question: Which Matplotlib backends are interactive?

gtk3

qt4

ps

svg

inline

Ans:-

gtk3

qt4

inline

Question: Which pandas function can be used to create a DataFrame with separate rows for a category?

pd.group_by()

pd.category()

pd.group()

pd.cat()

Ans:- pd.group_by()

Question: What is returned from the dplyr mutate function when operating on a tibble?

The tibble is converted into a list

A new row is added based on a function of existing rows

A new column is added based on a function of existing columns

A column’s values are modified according to a function

Ans:- A new column is added based on a function of existing columns

Question: Which of the following are examples of summary functions?

min

select

slice

median

left_join

max

Ans:-

min

median

max

Question: When performing a left join, right join, or full join, how does dplyr handle unmatched values?

By inserting NA

By dropping the row

By inserting 0

By inserting NULL

Ans:- By inserting NA

Question: Which matplotlib.pyplot functions can be used to create a tuple with the figure and axes of a chart?

plt.subplots()

plt.plot()

plt.axes()

plt.figure()

Ans:- plt.subplots()

Question: If you want to customize the size of the text of the x-axis label in a chart, how would you parameterize your call to the plt.xlabel() function?

Using the “label_size” input argument to the plt.xlabel() function

Using the “text_size” input argument to the plt.xlabel() function

Using the “fontsize” input argument to the plt.xlabel() function

Using the “size” input argument to the plt.xlabel() function

Ans:- Using the “fontsize” input argument to the plt.xlabel() function

Question: Assume you have data for the opening price and a closing price of a stock over a period of time.
If you want separate lines to represent the opening and closing prices of the stock, how will you parameterize your call to the matplotlib.pyplot.plot() function?

Using the “color” input argument to the matplotlib.pyplot.plot() function

Using the “hue” input arguments to the matplotlib.pyplot.plot() function

Using the “extra_cat” input argument to the matplotlib.pyplot.plot() function

Using the “cat” input argument to the matplotlib.pyplot.plot() function

Ans:- Using the “color” input argument to the matplotlib.pyplot.plot() function

Question: Given the dplyr tibble object created, how many rows will be returned in the result of the filter?

pineapples <- tibble(
country = c(“Costa Rica”, “Brazil”, “Philippines”, “Thailand”, “Indonesia”),
production = c(2.7, 2.5, 2.4, 2.2, 1.8)
)
filter(pineapples, production > 2.0)

0

2

1

3

4

Ans:- 4
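
The row count can be checked with a minimal Python sketch. The list of dicts below is only a hypothetical stand-in for the tibble, and the list comprehension plays the role of dplyr's filter; it is not how dplyr works internally.

```python
# Hypothetical Python mirror of the pineapples tibble (a list of dicts, not a real tibble).
pineapples = [
    {"country": "Costa Rica", "production": 2.7},
    {"country": "Brazil", "production": 2.5},
    {"country": "Philippines", "production": 2.4},
    {"country": "Thailand", "production": 2.2},
    {"country": "Indonesia", "production": 1.8},
]

# Keep rows whose production exceeds 2.0, as filter(pineapples, production > 2.0) would.
filtered = [row for row in pineapples if row["production"] > 2.0]
print(len(filtered))  # prints 4
```

Only Indonesia (1.8) fails the condition, so four rows remain.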

Question: What special operator is used by dplyr to pass a function argument to one of its methods?

>

%<%

%*%

%>%

|

Ans:- %>%

Question: Which R class most closely resembles the dplyr tibble?

list

raw

data.frame

vector

Ans:- data.frame

Question: When does it make sense to use treemaps?

To analyze proportions of individual categories at various points in time

Show trends over time where there are many ordered data points

Show the trend of a stock’s performance based on the high, low, and close of that stock over some days

To analyze proportions of individual categories

Ans:- To analyze proportions of individual categories

Question: What data does a heatmap convey?

The correlation matrix between all pairs of variables

The median of all variables

The 25th and 75th percentile for all variables

The outliers in all variables

Ans:- The correlation matrix between all pairs of variables

Question: When does it make sense to use pie charts?

To represent and visualize hierarchical information

To visualize the relationship between continuous variables

To analyze proportions of individual categories

Show trends over time where there are many ordered data points

Ans:- To analyze proportions of individual categories

Question: If you want to create a box plot without points representing outliers, how would you parameterize your call to the plt.boxplot() function?

Using the “showfliers” input argument to the plt.boxplot() function

Using the “outlier_markers” input argument to the plt.boxplot() function

Using the “showoutliers” input argument to the plt.boxplot() function

Using the “outliers” input argument to the plt.boxplot() function

Ans:- Using the “showfliers” input argument to the plt.boxplot() function

Question: Given the dplyr tibble object created, which column will be in the output of the select statement?

pineapples <- tibble(
country = c(“Costa Rica”, “Brazil”, “Philippines”, “Thailand”, “Indonesia”),
production = c(2.7, 2.5, 2.4, 2.2, 1.8)
)
select(pineapples, -production)

country, production

country

production

NA

Ans:- country

Question: How is the dplyr group_by method typically used?

When creating subsets of columns

In conjunction with dplyr join functions

In conjunction with dplyr summary functions

When filtering or slicing rows

Ans:- In conjunction with dplyr summary functions

Question: Which of these statements best defines auto-correlation?

Two variables which are not related to each other

A strong positive correlation between two variables

A strong negative correlation between two variables

The correlation of a variable with itself shifted in time

Ans:- The correlation of a variable with itself shifted in time
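
As a rough illustration of "a variable correlated with itself shifted in time", the sketch below computes the Pearson correlation between a series and a lagged copy of itself. The function name `autocorrelation` and the sample series are assumptions made for this example; statistical packages often use a slightly different estimator based on the overall mean.

```python
def autocorrelation(series, lag):
    """Pearson correlation between the series and itself shifted by `lag` steps."""
    if lag <= 0 or lag >= len(series) - 1:
        raise ValueError("lag must be positive and leave at least two overlapping points")
    x = series[:-lag]   # original values
    y = series[lag:]    # the same values, shifted in time
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / (var_x * var_y) ** 0.5

# Hypothetical alternating series: perfectly anti-correlated at lag 1, correlated at lag 2.
series = [1, -1, 1, -1, 1, -1, 1, -1]
lag1 = autocorrelation(series, 1)
lag2 = autocorrelation(series, 2)
```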

Question: Which of these statements is true about scatter plots?

The distribution of a continuous variable can be visualized

Two variables can be visualized in a scatter plot

Two categorical variables can be visualized

Multiple pairs of variables can be visualized

Ans:-Two variables can be visualized in a scatter plot

Question: When does it make sense to use area charts?

To analyze proportions of individual categories

Show the trend of a stock’s performance based on the high, low, and close of that stock over some days

To analyze composition of multiple categories over a period of time

Show trends over time where there are many ordered data points

Ans:- To analyze composition of multiple categories over a period of time

Question: What can a single box plot convey?

The outliers

The median

The distribution

The count

The 25th and 75th percentile

Ans:-

The outliers

The median

The 25th and 75th percentile

Question: Given the following line equation, and a y variable that can take on only positive values, for which value of x is y invalid?

y = -5x + 50

x = 1

x = 10

x = -1

x = 0

x = -10

Ans:- x = 10
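
The answer can be verified by evaluating the line at each candidate. This is a small sketch; the helper name `line` is chosen for illustration only.

```python
def line(x):
    """The line from the question: y = -5x + 50."""
    return -5 * x + 50

candidates = [1, 10, -1, 0, -10]

# y must be strictly positive, so any x giving y <= 0 is invalid.
invalid = [x for x in candidates if line(x) <= 0]
print(invalid)  # prints [10]
```

At x = 10 the line gives y = 0, which is not a positive value.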

Question: What does a negative result in the cor function indicate?

There is a negative correlation between the variables

The correlation function has produced an error

There is no correlation between the variables

There is a positive correlation between the variables

Ans:- There is a negative correlation between the variables

Question: What elements are returned by the code snippet demonstrating the dplyr setdiff function?

pcars %>% setdiff(pcars_training)

The columns of pcars that are not in pcars_training

The rows of pcars_training that are not contained in pcars

The rows of pcars that are also contained in pcars_training

The rows of pcars that are not in pcars_training

Ans:- The rows of pcars that are not in pcars_training

Question: When does it make sense to use scatter charts or correlation heatmaps?

To create multiple charts grouped by a category

To explore relationships between pairs of variables in data

Show trends over time where there are many ordered data points

To analyze data using multiple chart types in a single chart

Ans:- To explore relationships between pairs of variables in data

Question: What type is internally used to store elements of a factor?

raw

character

logical

integer

complex

Ans:- integer

Question: Given the following code snippet, which variable represents the dependent variable?

Ozone ~ Solar.R + Wind + Temp

Solar.R

Wind

Ozone

Temp

Ans:- Ozone

Question: Select the variables contained in the summary of a linear model.

variance-covariance matrix

r-squared values

f-statistic

confidence interval

residuals

Ans:-

r-squared values

f-statistic

residuals

Question: Select the summaries computed on a data frame by the summary function.

median

mean

variance

mode

min

max

Ans:-

median

mean

min

max

Question: What is the default method for handling NA values in the sort function?

NA values are discarded

NA values are left where they originally occurred

NA values are sorted last

NA values must be removed before sorting

NA values are sorted first

Ans:- NA values are discarded

Question: Match the regression method with its outcome type.

Answer Options:
A:binomial logistic regression
B:multinomial logistic regression
C:linear regression

two-valued

A

B

C

Ans:- A

multi-valued

A

B

C

Ans:- B

continuous

A

B

C

Ans:-C

Question: Given the following data frame code, select the valid methods for retrieving the production column.

pineapples <- data.frame(
country = c(“Costa Rica”, “Brazil”, “Philippines”, “Thailand”, “Indonesia”),
production = c(2.7, 2.5, 2.4, 2.2, 1.8)
)

pineapples[1,2]

pineapples[“production”]

pineapples[production]

pineapples[2,]

pineapples[,2]

pineapples%production

pineapples$production

Ans:-

pineapples[“production”]

pineapples[,2]

pineapples$production

Question: Given the following code, what is the result of the seq function?

seq(from = 2, to = 10, by = 2)

2 4 6 8

2 4 6 8 10

2 6 10

4 6 8

Ans:- 2 4 6 8 10
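
For comparison, a rough Python counterpart of this call is sketched below. The helper `seq` is hypothetical and only handles whole numbers with a positive step; the point is that R's seq includes the end value, while Python's range excludes it.

```python
def seq(start, stop, step):
    """Inclusive integer sequence, mirroring R's seq(from, to, by) for whole numbers."""
    # range() stops before its upper bound, so extend it by one to include `stop`.
    return list(range(start, stop + 1, step))

result = seq(2, 10, 2)
print(result)  # prints [2, 4, 6, 8, 10]
```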

Question: Match the expression with the operation performed on the matrices.

Answer Options:
A:A * B
B:A %*% B

matrix multiplication

A

B

Ans:- B

element-wise multiplication

A

B

Ans:- A
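
The difference between the two operators can be seen on a small example. The sketch below uses plain Python lists as hypothetical 2×2 matrices; the element-wise product corresponds to R's A * B and the row-by-column product to A %*% B.

```python
# Hypothetical 2x2 matrices stored as lists of rows.
A = [[1, 2], [3, 4]]
B = [[5, 6], [7, 8]]

# Element-wise multiplication (R: A * B): multiply entries in matching positions.
elementwise = [[a * b for a, b in zip(row_a, row_b)] for row_a, row_b in zip(A, B)]

# Matrix multiplication (R: A %*% B): dot product of each row of A with each column of B.
matrix_product = [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

print(elementwise)     # prints [[5, 12], [21, 32]]
print(matrix_product)  # prints [[19, 22], [43, 50]]
```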

Question: Given a decision tree outcome of three possible values, what does the predict function return for each prediction when given type = “prob” as an argument?

A vector containing the probability for each of the three possible outcomes

A single probability for the most likely outcome

A single value containing the most likely outcome

A vector of the possible outcomes in order of probability

Ans:- A vector containing the probability for each of the three possible outcomes

Question: What category of algorithm do clustering methods belong to?

supervised learning

classification

sorting

unsupervised learning

Ans:- unsupervised learning

Question: Given the following code, what are the contents of the vector v?
v <- 1:5
v[v < 3] <- 2

2 2

2 2 3 4 5

2 2 2 4 5

1 2 3 4 5

1 2 2 4 5

Ans:- 2 2 3 4 5
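
The logical-index assignment can be mimicked in Python as a sketch: every element that satisfies the condition is replaced and the others are kept unchanged.

```python
# Python counterpart of v <- 1:5
v = list(range(1, 6))

# Counterpart of v[v < 3] <- 2: replace elements below 3 with 2, keep the rest.
v = [2 if value < 3 else value for value in v]
print(v)  # prints [2, 2, 3, 4, 5]
```

Only the first two elements (1 and 2) satisfy v < 3, so only they are overwritten.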

Question: Select the matrix produced by the following diag function.

diag(3)

A 3×3 matrix with 0s everywhere

A 3×3 matrix with 3s everywhere

A 1×1 matrix with the value 3

A 3×3 matrix with 1s on the main diagonal and 0s everywhere else

Ans:- A 3×3 matrix with 1s on the main diagonal and 0s everywhere else

Question: Given the following code snippet, how does the set.seed function affect the sample_frac function?

sample_frac will fail if set.seed is not called first

sample_frac performs faster

sample_frac will select random rows each time it is run

sample_frac will always select the same rows for a given seed

Ans:- sample_frac will always select the same rows for a given seed
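
The same reproducibility idea can be sketched with Python's standard random module. The helper `sample_fraction` is a hypothetical stand-in for sample_frac, and random.seed plays the role of set.seed; the actual rows chosen by R will of course differ.

```python
import random

rows = list(range(100))  # hypothetical row indices of a data set

def sample_fraction(rows, fraction):
    """Draw a random subset containing the given fraction of the rows."""
    size = round(len(rows) * fraction)
    return random.sample(rows, size)

random.seed(42)                      # fix the generator state
first = sample_fraction(rows, 0.1)

random.seed(42)                      # same seed, same state
second = sample_fraction(rows, 0.1)  # identical to `first`

third = sample_fraction(rows, 0.1)   # no reseed: continues the random stream
```

Reseeding before each draw reproduces the selection exactly, which is what makes a train/test split repeatable.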

Question: What type of diagram is used to analyse a hierarchical cluster?

Line chart

Dendrogram

Scatter plot

Histogram

Ans:-Dendrogram

Question: What is the main difference between lists and vectors in R?

A list must contain elements of the same class type

A list must have named members

A list can contain elements of different classes

A list maintains a sorted order

Ans:- A list can contain elements of different classes

Question: Why is velocity important?

Organizations operate on their own schedules

How quickly data’s processed is unimportant

Velocity means fast

Customers usually ‘want it now’

Ans:- Customers usually ‘want it now’

Question: Why is structure so important to variety?

We never had structured data

Structured data lacks rules

Most modern data is unstructured

Unstructured data is organized

Ans:- Most modern data is unstructured

Question: Validity and Volatility are linked to which V?

Velocity

Veracity

Variety

Volume

Ans:- Veracity

Question: What does the k in k-means clustering refer to?

The number of columns of data

The number of different labels the data set contains

The number of clusters

The number of rows in the cluster data

Ans:- The number of clusters

Question: Match the R plot method parameter with its aes function equivalent.

Answer Options:
A:pch
B:bg

color

A

B
Ans:-B

shape

A

B
Ans:- A

Question: What do the points plotted on a box-and-whisker plot indicate?

Interquartile range

Mean values

Outliers

Median values

Ans:-Outliers

Question: Which is not an example of finding value in Big Data?

Understanding relationships between four V’s

Better insights into customer needs

Discarding old sensor data

Avoiding business disruption

Ans:- Discarding old sensor data

Question: Within how many standard deviations of the mean does an outcome fall 98% of the time?

1

2

3

4

Ans:- 3
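
Assuming the question is about standard deviations of a normal distribution (the original wording does not say so explicitly), the answer can be checked with a short sketch using Python's statistics.NormalDist. Two standard deviations cover roughly 95% of outcomes, so three are needed to reach 98%.

```python
from statistics import NormalDist

standard_normal = NormalDist(mu=0.0, sigma=1.0)

def coverage(k):
    """Probability that a normal outcome lies within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

# Smallest whole number of standard deviations that covers at least 98% of outcomes.
smallest_k = next(k for k in range(1, 10) if coverage(k) >= 0.98)
print(smallest_k)  # prints 3
```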

Question: How many variables can be described in a two-dimensional colored bubble plot?

2

4

5

3

Ans:- 4

Question: Select the JavaScript library that is used to create interactive plots through a web browser.

pip

Vue.js

d3.js

jQuery

Ans:-d3.js

Question: How many rows and columns will the following table function return in its result?

table(c(1,0,1), c(1,2,3))

1 column, 9 rows

9 rows, 1 column

3 rows, 3 columns

2 rows, 3 columns

3 rows, 2 columns

Ans:-2 rows, 3 columns
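
The shape of the cross-tabulation follows from the number of distinct values in each argument. The sketch below rebuilds the contingency table with collections.Counter; it only imitates what R's table does and is not R's implementation.

```python
from collections import Counter

x = [1, 0, 1]  # first argument: two distinct values, so two rows
y = [1, 2, 3]  # second argument: three distinct values, so three columns

row_levels = sorted(set(x))
col_levels = sorted(set(y))
counts = Counter(zip(x, y))  # how often each (x, y) pair occurs

# One row per level of x, one column per level of y.
contingency = [[counts[(r, c)] for c in col_levels] for r in row_levels]
print(contingency)  # prints [[0, 1, 0], [1, 0, 1]]
```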

Question: What data type does ggplot expect?

table

timeseries

vector

data frame

Ans:-data frame

Question: Match the visualization library with its canonical programming environment.

Answer Options:
A:Gnuplot
B:Ggplot2
C:Matplotlib

Shell (command line)

A

B

C

Ans:-A

Python

A

B

C

Ans:-C

R

A

B

C

Ans:-B

Question: Which of the following options are included in Hill’s criteria for causation?

Reproducibility

Authority

History

Education

Effect size

Temporality

Ans:-

Reproducibility

Effect size

Temporality

Question: How many bins will be defined for the following argument to a geom_histogram plot using ggplot?

breaks=seq(1.5, 5.5, by = 0.5)

7

8

10

9

Ans:-8
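
The count follows from the fact that n break points delimit n - 1 bins. A small Python sketch that reconstructs the break points (the variable names are illustrative):

```python
start, stop, step = 1.5, 5.5, 0.5

# Number of break points produced by seq(1.5, 5.5, by = 0.5).
n_breaks = int(round((stop - start) / step)) + 1
breaks = [start + i * step for i in range(n_breaks)]

# Each pair of neighbouring break points delimits one bin.
bins = list(zip(breaks[:-1], breaks[1:]))
print(len(breaks), len(bins))  # prints 9 8
```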

Question: What is the main source of correlation errors explained by Simpson’s Paradox?

Dimension reduction

Training data

Confounding variables

Validation data

Ans:-Confounding variables

Question: How is a bubble plot different from a scatter plot?

A scatter plot visualizes more information

A bubble plot has different sized points

A scatter plot has different sized points

A scatter plot can use color

Ans:- A bubble plot has different sized points

Question: Match the term with the scenario for its appropriate use.

Answer Options:
A:Jargon
B:Layman terms

Academic paper

A

B
Ans:- A

Public lecture

A

B
Ans:- B

Company meeting

A

B
Ans:- A

Blog post

A

B
Ans:- B

Question: What type of unclean data refers to data in varying units of measurement?

Inaccurate Data

Inconsistent Data

Erroneous Data

Missing Data

Non-standard Data

Ans:- Non-standard Data

Question: Given the following code, what will be the dimensions of tbl_final?

library(tidyverse)
tbl_test <- tibble(
ID = c(1,1,2,2,3,3),
name = c(‘name’, ‘year’, ‘name’, ‘year’, ‘name’, ‘year’),
value = c(‘Steve’, 1897, ‘Bob’, 2001, ‘Jane’, 1991)
)
tbl_final <- tbl_test %>% spread(name,value)

12 x 2

3 x 3

6 x 3

3 x 6

2 x 12

Ans:- 3 x 3
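
The long-to-wide reshaping can be sketched in plain Python to see where the 3 x 3 shape comes from. The tuples below mirror the rows of tbl_test (values are strings because R coerces the mixed value column to character), and the dictionary-based pivot is only an illustration of what spread does.

```python
# Long format: one (ID, name, value) triple per row, six rows in total.
rows = [
    (1, "name", "Steve"), (1, "year", "1897"),
    (2, "name", "Bob"),   (2, "year", "2001"),
    (3, "name", "Jane"),  (3, "year", "1991"),
]

# Wide format: one record per ID, with one column per distinct key.
wide = {}
for id_, key, value in rows:
    wide.setdefault(id_, {"ID": id_})[key] = value

tbl_final = list(wide.values())
n_rows = len(tbl_final)
n_cols = len(tbl_final[0])
print(n_rows, n_cols)  # prints 3 3
```

Each ID collapses to a single row, and the two distinct keys ("name" and "year") become columns next to ID.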

Question: Select the characteristics of informal communication in data science.

Tabular data

Personal anecdotes

Storytelling

Annotated algorithms

Scatter plots

Appropriate layman’s terms

Ans:-

Personal anecdotes

Storytelling

Appropriate layman’s terms

Question: Given the following regular expression substitution, select the description of the result.

gsub(“(^’)|(‘$)”, “”, var)

Any characters not enclosed in single quotes get removed

All single quotes get removed

Single quotes at the beginning and end of the string get removed

All characters between single quotes get removed

Ans:- Single quotes at the beginning and end of the string get removed
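
The same pattern behaves identically in Python's re module, which makes it easy to try out. The sample string is hypothetical; note that the quote inside the string is left untouched because the pattern is anchored to the start (^) and end ($).

```python
import re

var = "'it's quoted'"  # hypothetical input with a quote in the middle as well

# Remove a single quote at the very beginning or at the very end of the string.
cleaned = re.sub(r"(^')|('$)", "", var)
print(cleaned)  # prints it's quoted
```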

Question: Which functions can be used to test a data set for missing or NA values?

is.na

testNA

anyNA

tryCatch

findNA

Ans:-

is.na

anyNA

Question: What are some important data science strategies that are shared with software development?

Mathematical rigor

Statistical modeling

Continuous integration

Code annotation

Model versioning

Version control

Ans:-

Continuous integration

Code annotation

Version control

Question: By default, what sheet is loaded by read_excel?

most recently viewed

last

first

most recently created

Ans:- first

Question: Select examples of aggregate functions used in conjunction with group_by.

as.numeric

mean

select

min

mutate

max

Ans:-

mean

min

max

Question: When fetching a document over HTTP, what function can be used to check for errors?

glimpse

exit

tryCatch

anyNA

Ans:-tryCatch

Question: What argument passed to dbFetch will request that all rows be returned?

n=-1

n=NULL

n=0

n=1

Ans:- n=-1

Question: Match the data quality element with its criteria.

Instruction: Match each answer with the correct target. Each answer can only be used once.
Answer Options:
A:descriptive statistics
B:recency
C:cross reference
D:numeric range

constraints

A

B

C

D
Ans:- D

consistency checks

A

B

C

D
Ans:- C

validity

A

B

C

D
Ans:- B

data profiling

A

B

C

D
Ans:- A

Question: How many variables can be described in a two-dimensional colored scatter plot?

3

2

5

4

Ans:- 3

Question: Select the answer that best describes when to use a line plot.

When the data is random

When the data is clustered

When the data is spaced linearly

When the data is unstructured

Ans:- When the data is spaced linearly

Question: What function is used to split a column into two or more columns based on a delimiter?

separate

spread

mutate

tokenize

split

Ans:- separate

Question: Given a left join, if a corresponding record does not exist in the right table, what happens to the joined values?

The join results in an empty table

The missing values are replaced with NA

The missing values are replaced with NULL

The missing values are replaced with 0 or “”

The record is dropped

Ans:- The missing values are replaced with NA
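
A left join can be sketched in plain Python to show why unmatched rows are kept. The tables below are hypothetical, and Python's None stands in for R's NA.

```python
# Hypothetical left and right tables, keyed by "id".
left = [
    {"id": 1, "country": "Costa Rica"},
    {"id": 2, "country": "Brazil"},
    {"id": 3, "country": "Thailand"},
]
right = {1: 2.7, 3: 2.2}  # id -> production; there is no entry for id 2

# Left join: every left row survives; a missing match yields None (NA in R).
joined = [{**row, "production": right.get(row["id"])} for row in left]

for row in joined:
    print(row)
```

All three left rows appear in the result, and the row for id 2 carries a missing production value rather than being dropped.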

Question: Select the answer that best describes when to use a bar chart.

When the data is continuous

When the data is random

When the data is clustered

When the data is categorical

When the data is unstructured

Ans:- When the data is categorical

Question: In the context of Neural Networks, which of these statements correctly describe a fully connected layer?

Every neuron in this layer takes its input from all the neurons in the previous layer

Output of every neuron in this layer is fed as input to exactly one neuron in the next layer

Output of every neuron in this layer is fed as input to a specific neuron in the next layer

Every neuron in this layer takes its input from exactly one neuron in the previous layer

Ans:- Every neuron in this layer takes its input from all the neurons in the previous layer

Question: What would be the activation function of a Neural Network that is made to perform linear regression on input data?

identity

logit

ReLU

tanh

Ans:- identity
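
To see why the identity activation corresponds to linear regression, consider a single neuron sketched in Python. The function names and the weights are made up for illustration; with the identity activation, the output is exactly the linear combination w·x + b.

```python
def identity(z):
    """Identity activation: returns its input unchanged."""
    return z

def neuron(inputs, weights, bias, activation=identity):
    """Weighted sum of the inputs plus a bias, passed through an activation function."""
    z = sum(w * x for w, x in zip(weights, inputs)) + bias
    return activation(z)

# Hypothetical weights: the neuron computes 0.5 * x1 - 1.0 * x2 + 3.0.
prediction = neuron(inputs=[1.0, 2.0], weights=[0.5, -1.0], bias=3.0)
print(prediction)  # prints 1.5
```

A non-linear activation such as ReLU or tanh would bend this output, so the model would no longer be a plain linear regression.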

Question: Select the answer that best describes when to use a bar chart.

When the data is categorical

When the data is continuous

When the data is unstructured

When the data is clustered

Ans:- When the data is categorical

Question: Match the following statements about neurons in the context of machine learning with their correct Boolean values.


Answer Options:
A:A neuron can output multiple different values
B:A neuron can consist of only one function, a linear one
C:Every connection between two neurons has a weight W associated with it
D:A neuron is a mathematical function that can take multiple inputs and outputs a single value

False

A

B

C

D
Ans:- A,B

True

A

B

C

D
Ans:- C,D

Question: Match the following neural network terms with their corresponding definitions.

Answer Options:
A:Gradient Descent Optimization
B:Epoch
C:Learning Rate
D:Batch Size

The iterative process of adjusting the model parameters to minimize the loss

A

B

C

D
Ans:- A
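
The iterative adjustment described above can be sketched on a one-parameter toy problem. The loss function, starting point, and learning rate are assumptions chosen for illustration; real networks apply the same update to many parameters at once.

```python
def loss(w):
    """Toy loss with its minimum at w = 3."""
    return (w - 3.0) ** 2

def gradient(w):
    """Derivative of the toy loss with respect to w."""
    return 2.0 * (w - 3.0)

w = 0.0               # hypothetical starting value of the parameter
learning_rate = 0.1   # step size for each update

for _ in range(100):
    # Move the parameter a small step against the gradient to reduce the loss.
    w -= learning_rate * gradient(w)

print(round(w, 4))  # prints 3.0
```

Each update shrinks the distance to the minimum by a constant factor here, so the parameter converges to 3.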

Question: Match the visual element of the box plot with the data that it represents.

Answer Options:
A:Box height
B:Line
C:Whiskers
D:Points

Median

A

B

C

D
Ans:-B

Smallest and largest non-outliers

A

B

C

D
Ans:- C

First to third quartile range

A

B

C

D
Ans:- A

Outliers

A

B

C

D
Ans:- D

Question: Select the elements that make up a graph in a network visualization.

Edges

Nodes

Dendrites

Weights

Perceptrons

Ans:-

Edges

Nodes

Weights

Question: What values determine the critical value for a hypothesis test?

Sample mean

Sample standard deviation

Confidence level

Sample size

Degrees of freedom

Ans:-

Confidence level

Degrees of freedom

Question: Select the values used for a chi-square goodness-of-fit test.

Mean and confidence level

Median and standard deviation

Observed and expected frequencies

Prior and posterior probability

Ans:- Observed and expected frequencies
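
The test statistic itself is the sum of (observed - expected)² / expected over all categories. A minimal sketch with made-up counts from 120 rolls of a die, where a fair die would give 20 per face:

```python
# Hypothetical observed counts for faces 1..6 over 120 rolls.
observed = [18, 22, 20, 25, 15, 20]
# Expected counts under the null hypothesis of a fair die.
expected = [20, 20, 20, 20, 20, 20]

# Chi-square goodness-of-fit statistic.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))
degrees_of_freedom = len(observed) - 1

print(chi_square, degrees_of_freedom)
```

The statistic is then compared against the chi-square critical value for the chosen confidence level and these degrees of freedom.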