ALY6100 Northeastern University Netflix Efficiency Plan Proposal Paper | Abc Paper

(1) Email-essay scenario: Your boss has tasked you with leading the data science team effort for this project. (Or, your team for the Netflix prize has put you in charge.) Last week, you worked on defining the project’s objectives and questions that need to be answered. Now it’s time to make a plan for the first two weeks of work, which will be focused on defining what data needs to be used to answer the business questions and reach the objectives, and gather that data. Your boss has asked you to send a proposed plan as an email, including:

- What datasets will be needed
- Why these datasets? How does the information that they contain inform the decision or answer business questions?
- Which datasets exist internally?
- If any datasets don’t already exist, specify how they will be collected.*

* Use your knowledge of the cases / how businesses work to imagine what likely exists already internally at Salesforce and Netflix. This week’s video “Delivering High Quality Analytics at Netflix” will give you a sense of what sorts of data exists at Netflix, and help you imagine what data may exist at Salesforce.

Requirements:
- Minimum 300 words
- Minimum 2 references (can use book as a reference) with in line citations as appropriate
- Reference list

(3) Data description exercise
For one dataset specified in your email, write up a partial data encyclopedia and dictionary. Examples of one dataset:
- Salesforce: History of salaries and bonuses for each employee
- Netflix: All customer ratings for each video

Include (see Bartlett 12.2 for more details):
- Purpose of Dataset
- Source of dataset
- Time window (that the dataset represents)
- Cost of data (to the company)
- Collection techniques (see also Bartlett Chapter 10)
- Collection tools
- Quality
- Completeness
- For each column in the dataset: Name, Definition, Variable Classification (see Bartlett Table 12.1, p. 247)

For any details you cannot find in the cases or through research, make up a reasonable description.

Reading: for the exercise and last week’s work below:




Data description exercise
ALY 6100
For this exercise, I will consider the S&P return data, presented in the “Data Collection
Methodology” video.
1. Purpose of Dataset – Record yearly returns for S&P index, to use to inform parameters
for a retirement savings model
2. Source of dataset – Downloaded from
3. Time window (that the dataset represents) – 1928-2017
4. Cost of data (to the company) – Free download from internet
5. Collection techniques – Yearly market returns were calculated from
6. Collection tools – See Collection techniques
7. Quality – No quality concerns, as this is exact data, but historical changes in markets
must be considered when using this data to forecast the markets
8. Completeness – Complete, all years from 1928-2017 included
Annual returns on investments in S&P 500,
including appreciation and dividends, in percent
Variable Classification
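The entries above can also be kept in machine-readable form next to the data. The following is a minimal sketch in Python, assuming hypothetical `DatasetEntry` and `ColumnEntry` structures; the class names, field names, and the column name `annual_return_pct` are illustrative and do not come from the exercise. Details that are not given in the text (the download source and the variable classification) are left unset rather than guessed.

```python
from dataclasses import dataclass, field, fields
from typing import List, Optional, Tuple


@dataclass
class ColumnEntry:
    name: str                             # hypothetical name, not given in the exercise
    definition: str
    classification: Optional[str] = None  # None = not recorded yet


@dataclass
class DatasetEntry:
    purpose: str
    source: Optional[str]                 # None = not recorded yet
    time_window: Tuple[int, int]          # (first year, last year), inclusive
    cost: str
    quality: str
    completeness: str
    columns: List[ColumnEntry] = field(default_factory=list)

    def years_covered(self) -> int:
        """Number of years in the inclusive time window."""
        first, last = self.time_window
        return last - first + 1

    def missing_fields(self) -> List[str]:
        """Names of entries that still need to be filled in."""
        missing = [f.name for f in fields(self) if getattr(self, f.name) is None]
        missing += [
            f"columns.{c.name}.classification"
            for c in self.columns
            if c.classification is None
        ]
        return missing


sp_returns = DatasetEntry(
    purpose="Record yearly returns for S&P index, to inform a retirement savings model",
    source=None,  # the download location is elided in the text
    time_window=(1928, 2017),
    cost="Free download from internet",
    quality="Exact data; historical market changes must be considered when forecasting",
    completeness="Complete, all years from 1928-2017 included",
    columns=[
        ColumnEntry(
            name="annual_return_pct",  # assumed column name
            definition="Annual returns on investments in S&P 500, "
                       "including appreciation and dividends, in percent",
        )
    ],
)

print(sp_returns.years_covered())
print(sp_returns.missing_fields())
```

A structure like this makes it easy to check which parts of the encyclopedia are still incomplete before the dataset is used.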
Email Scenario
In the Netflix Prize and recommendation problem, accuracy depends on reducing errors.
This project therefore has two main objectives. The first is to educate employees about the
causes of missing data in the system, which reduces accuracy, and the second is to find
solutions for the missing data (Amatriain & Basilico, 2012). These objectives are
fundamental for several reasons. First, employees take part in many activities that can lead
to the loss of data. Lost data causes system errors, which lead to poor performance or
sometimes to failure of the systems. Some of the causes include programming errors, loss of
data during transfer, failure of users to fill in required fields, and users ignoring fields
because of personal beliefs about performance.
In this context, an organization is likely to run into significant losses. Can you imagine if the
company lost the details of all its clients due to a programming error? It might have to spend
substantial resources to recover from the loss. Thus, under this goal, the data science team
will learn how their actions and their knowledge of data-driven decisions affect the welfare
of the organization.
For the second goal, strategy formulation is the important thing to learn. The data
science team needs to know the best practices for keeping the data free from errors. This
objective is derived from the first one. Having identified the cause, why not formulate a strategy?
Knowing that a problem exists and dealing with it are two different things, which makes this
objective vital to the project.
There are several questions that the data science team should be able to answer in
order to make sound recommendations in this project. These questions include:
1. From the previous errors, which kinds of incidents in the organization led to their
2. Are there any identified ways to improve the efficiency of the project? If so, which
ones are they and how can they help?
3. Which technologies are at our disposal to help mitigate the risks we encounter on the
Amatriain, X., & Basilico, J. (2012, April 6). Netflix Recommendations: Beyond the 5 stars
(Part 1). Retrieved from
Amatriain, X., & Basilico, J. (2012, April 6). Netflix Recommendations: Beyond the 5 stars
(Part 2). Retrieved from
Short Answer
In the Netflix case, ratings improved because of the use of data-driven
decisions. What data-driven decisions add to the project, then, is accuracy. When dealing with
figures, accuracy is the most important thing. Providing accurate figures is fundamental to the
analysis of the business’s position (Amatriain & Basilico, 2012). For instance, the Netflix case
discusses ranking, a concept used by most online movie-selling organizations. People need
to know the level of enjoyment they will get by subscribing to given movies. To get this
right, data-driven decisions need to be made. First, every client’s opinion is analyzed and rated
on a chosen scale, for instance, a scale of 1-5 or 1-10. By including all the data involved, a true
reflection of the organization is produced. It is from truthful values that an individual is able to
make a successful move.
Accuracy is the greatest value that data-driven decisions add (Yamin-Ali, 2014). The
best decisions are those based on facts. Most resolutions fail to produce the
expected results because of the inaccuracies in them: they are based on assumptions. For
instance, if Netflix assumes that five hundred people out of a thousand who watched a given
movie on its website loved it, then it will be operating on a wrong basis. It can run into losses
because it will continue distributing the content without facts. What if just two hundred
people loved it, yet the crew made an assumption based on the analysis of the first fifty
people only and neglected to complete the results for the remaining nine hundred and fifty?
Data-driven decisions must be taken seriously, for any omission can lead to great errors. For instance,
in the Netflix case study, the organization is able to improve its ranking if accurate data is
supplied by the data science team.
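The risk in the example above can be made concrete with a little arithmetic. The sketch below, in Python, uses the figures from the example (a thousand viewers, of whom two hundred loved the movie, and a conclusion drawn from the first fifty). It assumes, purely for illustration, that half of those first fifty viewers happened to love the movie; that split and the function name `extrapolate` are hypothetical.

```python
def extrapolate(liked_in_sample: int, sample_size: int, population: int) -> float:
    """Scale the like-rate seen in a sample up to the whole audience."""
    if sample_size <= 0:
        raise ValueError("sample_size must be positive")
    return liked_in_sample / sample_size * population


TOTAL_VIEWERS = 1000   # from the example in the text
ACTUAL_LIKES = 200     # from the example in the text
SAMPLE_SIZE = 50       # the "first fifty people"
SAMPLE_LIKES = 25      # assumption: half of the first fifty loved it

estimate = extrapolate(SAMPLE_LIKES, SAMPLE_SIZE, TOTAL_VIEWERS)
error = estimate - ACTUAL_LIKES

print(f"Estimated likes: {estimate:.0f}")
print(f"Actual likes:    {ACTUAL_LIKES}")
print(f"Overestimate:    {error:.0f} viewers")
```

Under these assumed numbers, the first fifty viewers suggest 500 likes when only 200 exist, an overestimate of 300 viewers. Completing the analysis for every viewer removes that gap.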
Finkelstein, S., Whitehead, J., & Campbell, A. (2009). Think Again: Why Good Leaders Make
Bad Decisions and How to Keep It From Happening to You. Boston: Harvard Business
Review Press.
Yamin-Ali, J. (2014). Data-Driven Decision Making in Schools: Lessons from Trinidad.
Basingstoke: Palgrave Macmillan Limited.
