2015-10-11 07:47:52
This, as far as I’m concerned, is perfectly fine, especially since I agree with 98% of their views.
(My only quibble is around SQL—but that’s more an issue of my upbringing than of
disagreement.) What their unambiguous writing means is that you can focus on the craft and art
of data science and not be distracted by choices of which tools and methods to use. This
precision is what makes PDSwR practical. Let’s look at some specifics.

2015-10-11 07:49:29
Practical tool set: R is a given. In addition, RStudio is the IDE of choice; I’ve been using RStudio
since it came out. It has evolved into a remarkable tool—integrated debugging is in the latest
version. The third major tool choice in PDSwR is Hadley Wickham’s ggplot2. While R has
traditionally included excellent graphics and visualization tools, ggplot2 takes R visualization to
the next level. (My practical hint: take a close look at any of Hadley’s R packages, or those of his
students.) In addition to those main tools, PDSwR introduces necessary secondary tools: a
proper SQL DBMS for larger datasets; Git and GitHub for source code version control; and knitr
for documentation generation.

2015-10-11 07:53:08
Practical datasets: The only way to learn data science is by doing it. There’s a big leap from the
typical teaching datasets to the real world. PDSwR strikes a good balance between the need for a
practical (simple) dataset for learning and the messiness of the real world. PDSwR walks you
through how to explore a new dataset to find problems in the data, cleaning and transforming
when necessary.

2015-10-11 08:01:50
Practical human relations: Data science is all about solving real-world problems for your
client—either as a consultant or within your organization. In either case, you’ll work with a
multifaceted group of people, each with their own motivations, skills, and responsibilities. As
practicing consultants, Nina and John understand this well. PDSwR is unique in stressing the
importance of understanding these roles while working through your data science project.

2015-10-11 08:03:16
Practical modeling: The bulk of PDSwR is about modeling, starting with an excellent overview
of the modeling process, including how to pick the modeling method to use and, when done,
gauge the model’s quality. The book walks you through the most practical modeling methods
you’re likely to need. The theory behind each method is intuitively explained. A specific example
is worked through—the code and data are available on the authors’ GitHub site. Most
importantly, tricks and traps are covered. Each section ends with practical takeaways.

2015-10-11 09:34:22
The figure on the cover of Practical Data Science with R is captioned “Habit of a Lady of
China in 1703.” The illustration is taken from Thomas Jefferys’ A Collection of the
Dresses of Different Nations, Ancient and Modern (four volumes), London, published
between 1757 and 1772. The title page states that these are hand-colored copperplate
engravings, heightened with gum arabic. Thomas Jefferys (1719–1771) was called
“Geographer to King George III.” He was an English cartographer who was the leading
map supplier of his day. He engraved and printed maps for government and
other official bodies and produced a wide range of commercial maps and atlases,
especially of North America. His work as a mapmaker sparked an interest in local
dress customs of the lands he surveyed and mapped; they are brilliantly displayed in
this four-volume collection.

2015-10-11 09:36:42
The data scientist is responsible for guiding a data science project from start to finish.
Success in a data science project comes not from access to any one exotic tool,
but from having quantifiable goals, good methodology, cross-discipline interactions,
and a repeatable workflow.
This chapter walks you through what a typical data science project looks like:
the kinds of problems you encounter, the types of goals you should have, the tasks
that you’re likely to handle, and what sort of results are expected.

2015-10-11 09:55:55
In defining the roles here, we’ve borrowed some ideas from the “surgical team”
perspective on software development in Frederick Brooks’s The Mythical Man-Month:
Essays on Software Engineering (Addison-Wesley, 1995) and also from the agile
software development paradigm.

2015-10-11 09:57:26
Role             Responsibilities
Project sponsor  Represents the business interests; champions the project
Client           Represents end users’ interests; domain expert
Data scientist   Sets and executes analytic strategy; communicates with sponsor and client
Data architect   Manages data and data storage; sometimes manages data collection
Operations       Manages infrastructure; deploys final project results

2015-10-11 09:58:26
CLIENT
While the sponsor is the role that represents the business interest, the client is the role
that represents the model’s end users’ interests. Sometimes the sponsor and client
roles may be filled by the same person. Again, the data scientist may fill the client role
if they can weigh business trade-offs, but this isn’t ideal.
The client is more hands-on than the sponsor; they’re the interface between the
technical details of building a good model and the day-to-day work process into which
the model will be deployed. They aren’t necessarily mathematically or statistically
sophisticated, but are familiar with the relevant business processes and serve as the
domain expert on the team. In the loan application example that we discuss later in
this chapter, the client may be a loan officer or someone who represents the interests
of loan officers.
As with the sponsor, you should keep the client informed and involved. Ideally
you’d like to have regular meetings with them to keep your efforts aligned with the
needs of the end users. Generally the client belongs to a different group in the organization
and has other responsibilities beyond your project. Keep meetings focused,
present results and progress in terms they can understand, and take their critiques to
heart. If the end users can’t or won’t use your model, then the project isn’t a success,
in the long run.

2015-10-11 11:04:27
The next role in a data science project is the data scientist, who’s responsible for taking
all necessary steps to make the project succeed, including setting the project strategy
and keeping the client informed. They design the project steps, pick the data
sources, and pick the tools to be used. Since they pick the techniques that will be
tried, they have to be well informed about statistics and machine learning. They’re
also responsible for project planning and tracking, though they may do this with a
project management partner.
At a more technical level, the data scientist also looks at the data, performs statistical
tests and procedures, applies machine learning models, and evaluates results—the
science portion of data science.

2015-10-11 11:06:22
The data architect is responsible for all of the data and its storage. Often this role is
filled by someone outside of the data science group, such as a database administrator
or architect. Data architects often manage data warehouses for many different projects,
and they may only be available for quick consultation.

2015-10-11 11:09:31
Let’s look at the different stages shown in figure 1.1. As a real-world example, suppose
you’re working for a German bank. The bank feels that it’s losing too much
money to bad loans and wants to reduce its losses. This is where your data science
team comes in.

2015-10-11 12:10:37
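Passing along a joke:
When Tu Youyou won the prize, the host read out the winner's name and the whole audience was baffled: "The winner is you, you too!" Flustered, the host hurried to correct himself: "The winner is to you, you." The audience was in an uproar; nobody knew who had actually won. The host could only go on to explain: "Her name in Chinese means kill and scream." The audience was in an uproar again. And the last time, when Mo Yan won, it was... "shut up!"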

2015-10-11 12:11:47
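In ancient times, highway robbery went like this: "I opened this mountain, I planted this tree. If you want to pass this way, leave your toll money behind!" How crude and barbaric that language was! After more than a thousand years of civilizing influence, the language of modern society has become polite and considerate: "Toll station 500 meters ahead. Please slow down, stop to collect your card, and pay the toll. Thank you for your cooperation! We wish you a safe journey..." How civilized and courteous!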

2015-10-13 09:52:21
Really?

2015-10-14 08:09:10
Abstract
Never before have network airlines been so exposed and vulnerable to low-cost carriers (LCCs). While LCCs had 26.3% of all world seats in 2013, Southeast Asia had 57.7% and South Asia 58.4% – and these figures will only increase. There are many consequences of LCCs on network airlines, including inadequately meeting the expectations of customers, so increasing dissatisfaction, and not offering sufficient value-for-money. Clearly, it is fundamentally important for Asian network airlines to respond appropriately to LCCs. This paper looks at the strategic capability of 22 of the top Asian network airlines in competing with LCCs, which is achieved by analysing questionnaire data from these airlines in terms of 37 competitive responses across six distinct response categories. It is crucial to note that this paper only concerns their capability in competing with LCCs, and it does not consider their overall strength. This paper also investigates how strategic capability varies by Asian sub-region and by airline performance, with performance examined in two respects: by perceived performance and actual performance. The results show that strategic capability varies widely, with Vietnam Airlines possessing the strongest strategic capability to compete with LCCs and SilkAir the weakest. Of others that compete heavily with LCCs, Malaysia Airlines and Garuda Indonesia have strong capabilities, while Philippine Airlines does not. However, all three need to more forcefully respond to LCCs. As a whole, network airlines within Southeast Asia have the greatest strategic capability, and Northeast Asia the weakest. There is a reasonably strong correlation between strategic capability and both actual and perceived performance, which suggests that those airlines with strong strategic capabilities should achieve strong overall performance.

2015-10-14 08:09:43
Never before have network airlines been so exposed and vulnerable to low-cost carriers (LCCs).

2015-10-14 08:10:16
While LCCs had 26.3% of all world seats in 2013, Southeast Asia had 57.7% and South Asia 58.4% – and these figures will only increase.

2015-10-14 08:11:01
There are many consequences of LCCs on network airlines, including inadequately meeting the expectations of customers, so increasing dissatisfaction, and not offering sufficient value-for-money.

2015-10-14 08:15:17
Clearly, it is fundamentally important for Asian network airlines to respond appropriately to LCCs.

2015-10-14 08:16:43
This paper looks at the strategic capability of 22 of the top Asian network airlines in competing with LCCs, which is achieved by analysing questionnaire data from these airlines in terms of 37 competitive responses across six distinct response categories.

2015-10-14 08:47:03
It is crucial to note that this paper only concerns their capability in competing with LCCs, and it does not consider their overall strength.

2015-10-14 08:56:36
This paper also investigates how strategic capability varies by Asian sub-region and by airline performance, with performance examined in two respects: by perceived performance and actual performance.

2015-10-16 16:33:03
A New Method for Statistical Disclosure Limitation, I

2015-10-16 16:34:33
Let's face it, PowerPoint isn't going anywhere. Even if you use R (or Python) for data analysis, PowerPoint is how you distribute and communicate results, and learning how to create those decks as part of your R workflow can pay off. Beyond efficiency and repeatability, programmatic access enables you to do things that just aren't possible with point and click.

2015-10-16 16:45:17
(Here's a short video of what we're aiming for, how it was created and the R code.)

In this talk, S Anand uses Python and pywin32 to create some jaw-dropping effects in PowerPoint, scraping data from IMDB and creating a PowerPoint slide from that data. RDCOMClient by Duncan Temple Lang allows you to do the same thing using R. It provides the ability to "access and control applications such as Excel, Word, PowerPoint, Web browsers etc."

2015-10-18 07:45:47
The balanced credibility estimators with correlation risk and inflation factor

2015-10-18 07:46:39
Abstract
In classical credibility theory, claims are assumed to be independent over risks and the premiums are derived under squared loss functions. However, in many practical situations, these assumptions may be violated. Hence, this paper investigates the credibility estimators under a balanced loss function with an equal dependence structure among the individual risks and an inflation factor. To be specific, the inhomogeneous and homogeneous credibility estimators are derived for the Bühlmann–Straub credibility model.
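
For readers who have not seen the baseline this abstract departs from, here is a minimal Python sketch of the classical Bühlmann–Straub estimators, that is, the textbook version under squared loss with independent risks. It is not the paper's balanced-loss, dependent-risk, inflation-adjusted estimator. The names Risk, sigma2 (expected process variance), and tau2 (variance of the hypothetical means) are illustrative assumptions, and both structural parameters are taken as known rather than estimated.

```python
from dataclasses import dataclass


@dataclass
class Risk:
    """One individual risk (hypothetical container for this sketch)."""
    claims: list[float]   # observed claim ratios X_ij, one per period
    weights: list[float]  # exposure weights w_ij, one per period


def weighted_mean(risk: Risk) -> float:
    # Exposure-weighted average of the risk's own observed claims.
    total = sum(risk.weights)
    return sum(w * x for w, x in zip(risk.weights, risk.claims)) / total


def credibility_factor(risk: Risk, sigma2: float, tau2: float) -> float:
    # Z_i = w_i / (w_i + sigma2 / tau2), where w_i is the total exposure.
    total = sum(risk.weights)
    return total / (total + sigma2 / tau2)


def inhomogeneous_premium(risk: Risk, mu: float, sigma2: float, tau2: float) -> float:
    # Blend the risk's own experience with a known collective mean mu.
    z = credibility_factor(risk, sigma2, tau2)
    return z * weighted_mean(risk) + (1 - z) * mu


def homogeneous_premium(risks: list[Risk], index: int, sigma2: float, tau2: float) -> float:
    # Same blend, but the collective mean is replaced by the
    # credibility-weighted average of the individual means.
    zs = [credibility_factor(r, sigma2, tau2) for r in risks]
    means = [weighted_mean(r) for r in risks]
    mu_hat = sum(z * m for z, m in zip(zs, means)) / sum(zs)
    return zs[index] * means[index] + (1 - zs[index]) * mu_hat
```

The inhomogeneous form needs the collective mean as an input, while the homogeneous form estimates it from the portfolio itself; that distinction is the one the abstract refers to when it mentions both estimators.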

2015-10-18 07:47:33
A course in credibility theory and its application