2015-8-28 09:08:18
We consider the problem of recovery of an unknown multivariate signal $f$ observed in a $d$-dimensional Gaussian white noise model of intensity $\varepsilon$. We assume that $f$ belongs to a class of smooth functions ${\cal F}^d\subset L_2([0,1]^d)$ and has an additive sparse structure determined by the parameter $s$, the number of non-zero univariate components contributing to $f$. We are interested in the case when $d=d_\varepsilon \to \infty$ as $\varepsilon \to 0$ and the parameter $s$ stays "small" relative to $d$. With these assumptions, the recovery problem in hand becomes that of determining which sparse additive components are non-zero. Attempting to reconstruct most non-zero components of $f$, but not all of them, we arrive at the problem of almost full variable selection in high-dimensional regression. For two different choices of ${\cal F}^d$, we establish conditions under which almost full variable selection is possible, and provide a procedure that gives almost full variable selection. The procedure does the best (in the asymptotically minimax sense) in selecting most non-zero components of $f$. Moreover, it is adaptive in the parameter $s$.

2015-8-28 09:13:46
We study graphons as a non-parametric generalization of stochastic block models, and show how to obtain compactly represented estimators for sparse networks in this framework. Our algorithms and analysis go beyond previous work in several ways. First, we relax the usual boundedness assumption for the generating graphon and instead treat arbitrary integrable graphons, so that we can handle networks with long tails in their degree distributions. Second, again motivated by real-world applications, we relax the usual assumption that the graphon is defined on the unit interval, to allow latent position graphs where the latent positions live in a more general space, and we characterize identifiability for these graphons and their underlying position spaces.

We analyze three algorithms. The first is a least squares algorithm, which gives an approximation we prove to be consistent for all square-integrable graphons, with errors expressed in terms of the best possible stochastic block model approximation to the generating graphon. Next, we analyze a generalization based on the cut norm, which works for any integrable graphon (not necessarily square-integrable). Finally, we show that clustering based on degrees works whenever the underlying degree distribution is absolutely continuous with respect to Lebesgue measure. Unlike the previous two algorithms, this third one runs in polynomial time.

2015-8-29 17:37:02
library(tm)
library(SnowballC)
library(wordcloud)
jeopQ <- read.csv('JEOPARDY_CSV.csv', stringsAsFactors = FALSE)

2015-8-29 17:38:32
(This article was first published on Thinking inside the box, and kindly contributed to R-bloggers) A pure maintenance release 0.1.3 of the RcppDE package arrived on CRAN yesterday. RcppDE is a "port" of DEoptim, a popular package for derivative-free optimisation using differential evolution,...

2015-8-29 17:39:11
(This article was first published on Working With Data » R, and kindly contributed to R-bloggers) This entry is part 17 of 17 in the series Using R. Sometimes it is useful to write a wrapper function for an existing function. In this short example we demonstrate how to grab the list of arguments...
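
The excerpt above is cut off, so the following is only a minimal sketch of the general idea and not the original post's code. It assumes a hypothetical wrapper called `mean_wrapper` around base R's `mean()`: the wrapper captures the caller's extra arguments with `list(...)`, fills in a default, and forwards everything with `do.call()`.

```r
# Hypothetical wrapper around mean(): captures whatever extra arguments the
# caller supplied, fills in a default, and forwards them with do.call().
mean_wrapper <- function(x, ...) {
  args <- list(...)                 # grab the extra arguments as a named list
  if (is.null(args$na.rm)) {
    args$na.rm <- TRUE              # supply a default the caller did not set
  }
  do.call(mean, c(list(x), args))   # call the wrapped function with the list
}

res_default  <- mean_wrapper(c(1, 2, NA, 3))                 # na.rm defaults to TRUE
res_override <- mean_wrapper(c(1, 2, NA, 3), na.rm = FALSE)  # caller's value wins
```

Because the arguments live in an ordinary list, the wrapper can inspect, add, or drop entries before the wrapped function ever sees them.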

2015-8-29 17:53:47
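A new study from the U.S. National Institutes of Health shows that Omega-3 supplements do not slow cognitive decline in older adults. The study provides the strongest evidence to date that Omega-3 supplements have no effect on Alzheimer's disease and other forms of cognitive decline. Scientists followed 4,000 older adults for five years and found that fish oil made no difference at all.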

2015-8-29 18:13:02
We do not provide character education, history education, or education in ideals.

2015-8-29 18:26:13
Lists are a data type in R that are perhaps a bit daunting at first, but soon become amazingly useful. They are especially wonderful once you combine them with the powers of the apply() functions. This post will be part 1 of a two-part series on the uses of lists. In this post, we will discuss the basics - how to create lists, manipulate them, describe them, and convert them. In part 2, we’ll see how using lapply() and sapply() on lists can really improve your R coding.

Constructing a list

Let’s start with what lists are and how to construct them. A list is a data structure that can hold any number of other data structures, of any type. If you have a vector, a dataframe, and a character object, you can put all of them into one list object like so:

# create three different classes of objects
vec <- 1:4
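# stringsAsFactors = TRUE keeps x a factor, as in the output shown later
# (this was the default before R 4.0.0)
df <- data.frame(y = c(1:3), x = c("m", "m", "f"), stringsAsFactors = TRUE)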
char <- "Hello!"

# add all three objects to one list using list() function
list1 <- list(vec, df, char)

# print list
list1
## [[1]]
## [1] 1 2 3 4
##
## [[2]]
##   y x
## 1 1 m
## 2 2 m
## 3 3 f
##
## [[3]]
## [1] "Hello!"
We can also turn an object into a list by using the as.list() function. Notice how every element of the vector becomes a different component of the list.

# coerce vector into a list
as.list(vec)
## [[1]]
## [1] 1
##
## [[2]]
## [1] 2
##
## [[3]]
## [1] 3
##
## [[4]]
## [1] 4
Manipulating a list

We can put names on the components of a list using the names() function. This is useful for extracting components. We could have also named the components when we created the list. See below:
# name the components of the list
names(list1) <- c("Numbers", "Some.data", "Letters")
list1
## $Numbers
## [1] 1 2 3 4
##
## $Some.data
##   y x
## 1 1 m
## 2 2 m
## 3 3 f
##
## $Letters
## [1] "Hello!"
# could have named them when we created list
another.list <- list(Numbers = vec, Letters = char)
There are several ways to extract components from a list. The first is the [[ ]] operator (note the two brackets, not just one). The single [ ] operator also works on a list, but it returns a list rather than the component itself, which is usually not what we want. Compare:
# extract 3rd component using [[]] -> this returns a *string*
list1[[3]]
## [1] "Hello!"
# print a list containing the 3rd component -> this returns a *list*
list1[3]
## $Letters
## [1] "Hello!"
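
To make the difference concrete, here is a small sketch that checks the class of each result. The list `demo` is built only for this illustration and is not part of the tutorial's running example.

```r
# A small list built only for this illustration
demo <- list(Numbers = 1:4, Letters = "Hello!")

single <- demo["Letters"]     # single bracket: a list of length 1
double <- demo[["Letters"]]   # double bracket: the component itself

class(single)   # "list"
class(double)   # "character"

# Functions that expect the underlying data need the [[ ]] form
n_chars <- nchar(double)
total   <- sum(demo[["Numbers"]])
```

Calling `sum(demo["Numbers"])` instead would fail, because `sum()` would receive a list rather than an integer vector.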
It is also possible to extract components using the component’s name as we see below. Again, be careful about the [ ] vs [[ ]] operator in the second way. You need the [[ ]] to return the data structure of the component.

# extract 3rd component using $
list1$Letters
## [1] "Hello!"
# extract 3rd component using [[ ]] and the name of the component
list1[["Letters"]]
## [1] "Hello!"
Subsetting a list - use the single [ ] operator and c() to choose the components
# subset the first and third components
list1[c(1, 3)]
## $Numbers
## [1] 1 2 3 4
##
## $Letters
## [1] "Hello!"
We can also add a new component to the list or replace a component using the $ or [[ ]] operators. This time I’ll add a linear model to the list (remember we can put anything into a list).
# add new component to existing list using $
list1$newthing <- lm(y ~ x, data = df)

# add a new component to existing list using [[ ]]
list1[[5]] <- "new component"
Finally, we can delete a component of a list by setting it equal to NULL.
# delete a component of existing list
list1$Letters <- NULL
list1
## $Numbers
## [1] 1 2 3 4
##
## $Some.data
##   y x
## 1 1 m
## 2 2 m
## 3 3 f
##
## $newthing
##
## Call:
## lm(formula = y ~ x, data = df)
##
## Coefficients:
## (Intercept)           xm  
##         3.0         -1.5  
##
##
## [[4]]
## [1] "new component"
The Letters component is gone, so there are now only 4 components. Notice that the 4th component doesn’t have a name, because we didn’t assign it one when we added it.

More extracting: If we want to extract the dataframe we have in the list and look at just its first row, we would use list1[[2]][1,]. This code takes the second component of the list with the [[ ]] operator (which gives us the dataframe), then subsets the first row and all columns with the single [ ] operator, since that is what is used to subset dataframes (or matrices).

For help on subsetting matrices and dataframes, check out this post.

# extract first row of dataframe that is in a list
list1[[2]][1, ]
##   y x
## 1 1 m
Describing a list

To describe a list, we may want to know the following:

the class of the list (which is a list class!) and the class of the first component of the list.
# describe class of the whole list
class(list1)
## [1] "list"
# describe the class of the first component of the list
class(list1[[1]])
## [1] "integer"
the number of components in the list - use the length() function
# find out how many components are in the list
length(list1)
## [1] 4
a short summary of each component in the list - use str(). (I take out the model because the output is really long)
# take out the model from list and then show summary of what's in the list
list1$newthing <- NULL
str(list1)
## List of 3
##  $ Numbers  : int [1:4] 1 2 3 4
##  $ Some.data:'data.frame':   3 obs. of  2 variables:
##   ..$ y: int [1:3] 1 2 3
##   ..$ x: Factor w/ 2 levels "f","m": 2 2 1
##  $          : chr "new component"
Now we can combine these functions to append a component to the end of the list, by assigning it to the index one past the current length of the list:
# construct new list of two components
new.list <- list(vec, char)

# notice that it has two components
length(new.list)
## [1] 2
# append a component to the end and print
new.list[[length(new.list) + 1]] <- "Appended"

new.list
## [[1]]
## [1] 1 2 3 4
##
## [[2]]
## [1] "Hello!"
##
## [[3]]
## [1] "Appended"
Notice you could keep doing this as the length is now 3. You could also use the $ operator to name a new component and that would append it at the end, as we saw above.

Initializing a list

To initialize a list to a certain number of null components, we use the vector function like this:

# initialize list to have 3 null components and print
list2 <- vector("list", 3)
list2
## [[1]]
## NULL
##
## [[2]]
## NULL
##
## [[3]]
## NULL
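
One common reason to initialize a list this way is to fill it inside a loop. The following is a minimal sketch of that pattern; the names `n` and `squares` are made up for the illustration.

```r
# Preallocate a list with one slot per result, then fill it in a loop
n <- 3
squares <- vector("list", n)
for (i in seq_len(n)) {
  squares[[i]] <- (1:i)^2     # each slot can hold a vector of a different length
}

lengths(squares)   # number of elements in each component
```

One caveat: assigning NULL with `squares[[i]] <- NULL` deletes the slot rather than storing NULL in it, as we saw when deleting components earlier.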
Converting a list into matrix or dataframe

Finally, we can convert a list into a matrix, dataframe, or vector in a number of different ways. The first, most basic way is to use unlist(), which just turns the whole list into one long vector:

# convert to one long vector - use unlist
unlist(list1)
##        Numbers1        Numbers2        Numbers3        Numbers4
##             "1"             "2"             "3"             "4"
##    Some.data.y1    Some.data.y2    Some.data.y3    Some.data.x1
##             "1"             "2"             "3"             "2"
##    Some.data.x2    Some.data.x3                 
##             "2"             "1" "new component"
But often we have matrices or dataframes as components of a list and we would like to combine them or stack them into one dataframe. The following shows the two good ways I’ve found to do this (from this StackOverflow page) using ldply() from the plyr package or rbind().

First, we create a list of matrices and then convert the list of matrices into a dataframe.

#create list of matrices and print
mat.list <- list(mat1=matrix(c(1,2,3,4), nrow=2), mat2=matrix(c(5,6,7,8), nrow=2))
mat.list
## $mat1
##      [,1] [,2]
## [1,]    1    3
## [2,]    2    4
##
## $mat2
##      [,1] [,2]
## [1,]    5    7
## [2,]    6    8
#convert to data frame
#1. use ldply
library(plyr)
ldply(mat.list, data.frame)
##    .id X1 X2
## 1 mat1  1  3
## 2 mat1  2  4
## 3 mat2  5  7
## 4 mat2  6  8
#2. use rbind
do.call(rbind.data.frame, mat.list)
##        V1 V2
## mat1.1  1  3
## mat1.2  2  4
## mat2.1  5  7
## mat2.2  6  8
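
If the result should stay a matrix rather than become a dataframe, plain rbind() can be used with do.call() as well. This is a sketch under that assumption; the list `mats` is rebuilt here so the example stands on its own.

```r
# Two 2x2 matrices in a named list, built only for this illustration
mats <- list(mat1 = matrix(1:4, nrow = 2), mat2 = matrix(5:8, nrow = 2))

stacked <- do.call(rbind, mats)          # still a matrix, 4 rows by 2 columns
stacked_df <- as.data.frame(stacked)     # convert once at the end if needed

dim(stacked)
```

Unlike ldply(), this approach does not record which matrix each row came from, so keep track of that separately if it matters.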
Get ready for part 2 next time, when we’ll see what we can use lists for and why we should use them at all.

2015-8-30 13:03:10
  logical, integer, double, complex, character, or raw.
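
This line reads like the list of R's six atomic vector types, although the post gives no surrounding context. Assuming that reading, a quick sketch that builds one value of each type and checks it with typeof():

```r
# One example value for each of the six atomic vector types
examples <- list(
  TRUE,            # logical
  1L,              # integer
  1.5,             # double
  2+3i,            # complex
  "a",             # character
  as.raw(255)      # raw
)

types <- vapply(examples, typeof, character(1))
types
```

A list is used to hold the examples because combining them with c() would coerce everything to a single type.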

2015-8-30 13:29:12
The TABULATE procedure displays descriptive statistics in tabular format, using some or all of the variables in a data set. You can create a variety of tables ranging from simple to highly customized.

2015-8-30 14:15:33
Trump says he'll decide 'very soon' on whether to rule out independent bid - Washington Post

2015-8-30 17:15:30
You could also use the $ operator to name a new component and that would append it at the end, as we saw above.

2015-8-30 17:52:21
Now that we know the basics, in this post we’ll use the apply() functions to see just how powerful working with lists can be. I’ve done two posts on apply() for dataframes and matrices, here and here, so give those a read if you need a refresher.

Intro to apply-based functions for lists
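
The post is cut off at this point, so here is only a minimal sketch of the two functions named in part 1, lapply() and sapply(). The list `scores` is made up for the illustration.

```r
# A small list built only for this illustration
scores <- list(a = c(1, 2, 3), b = c(10, 20), c = 5)

means_list <- lapply(scores, mean)   # always returns a list
means_vec  <- sapply(scores, mean)   # simplifies to a named numeric vector
```

Both calls apply mean() to every component; the difference is only in the shape of what comes back.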

2015-8-31 08:30:13
A Picture of a Headlock That's Worth a Thousand Words - Haaretz

2015-8-31 08:36:21
Slain Virginia reporter's father vows to fight for gun control - CNN

2015-8-31 08:38:23
Dominica Declares Disaster Status After Storm Leaves 20 Dead - New York Times
ROSEAU, Dominica - Rescue teams worked Sunday to reopen roads to remote communities in Dominica after Tropical Storm Erika caused flooding and mudslides that killed at least 20 people and left more than 50 missing on the Caribbean...

2015-9-4 13:50:06
ABC model choice via random forests

2015-9-4 16:18:21
An 11-year-old left at home to defend himself and his 4-year-old sister staved off several home invasion attempts before finally shooting and killing a 16-year-old intruder, police say. Police officers arrived after 2 p.m.

2015-9-4 16:19:07
It turns out people will need a ticket to get close to Pope Francis during his Sept. 27 Mass on the Benjamin Franklin Parkway in Philadelphia. And “close” is a relative term.

2015-9-4 16:20:33
WASHINGTON - Saudi King Salman will meet with U.S. President Barack Obama in Washington on Friday to seek more support in countering Iran, as the Obama administration aims to use the visit to shore up relations after a period of tensions.

2015-9-4 16:21:09
(Corrects headline to note the hearing on corruption charges is pending). * Perez resigns days before presidential election. * Faces claims of involvement in 'La Linea' customs scandal. * Scandal has gutted government, triggered protests.

2015-9-4 16:22:15
Players and attendees at the U.S. Open are used to reminders that flying objects are around; usually, though, they come in the form of noise from planes taking off from nearby La Guardia airport.

2015-9-4 16:23:45
A coalition of women's health-care providers on Thursday asked the Supreme Court to review a federal appeals court decision that would shutter all but a handful of abortion clinics in Texas.

2015-9-4 16:24:24
Basic Statistics in Multivariate Analysis (Pocket Guides to Social Work Research Methods) by Karen A. Randolph and Laura L. Myers
English | 2013 | ISBN: 0199764042 | 224 pages | PDF | 1 MB

2015-9-4 16:30:32
For those who aren’t yet familiar with R, let’s begin with a quick overview. To start, R is a fascinating programming language, one that has recently become an appealing skill to add to your resume. That’s partly because the language has grown significantly in popularity; it’s now used in a range of professions including software development, business analysis, statistical reporting and scientific research. It’s more likely than ever that you’ll encounter R in your organization — and you’ll probably even find reasons to use it yourself.

2015-9-4 16:31:28
If you need proof, look no further than R’s growth, which is reflected in a number of independent lists; it has bounced around in the top 20 languages in the Tiobe Index of Programming Language Popularity for the last several years. In 2015, IEEE ranked R sixth in its top 10 languages. Additionally, as the amount of data-intensive work increases, the demand for tools like R for processing, data mining and visualization will also increase.

2015-9-4 16:32:05
R originated as an open-source implementation of the S programming language in the 1990s. Since then, it has gained the support of a number of companies, most notably RStudio and Revolution Analytics, which created tools, packages, and services related to the language. But support isn’t limited to these more specialized companies; R also has backing from large companies that power some of the largest relational databases in the world. Oracle, for one, has incorporated R into its offerings. Earlier this year Microsoft acquired Revolution Analytics and is including the language in SQL Server 2016. SQL Server administrators and .NET developers now have R at their fingertips, installed with their standard platform tools.

2015-9-4 16:32:44
R in higher education
Here’s a fun fact: R originated in academia. Ross Ihaka and Robert Gentleman at the University of Auckland in New Zealand created it, and it’s been widely adopted in graduate programs that include intensive statistical study. R has also been used in massive open online courses (MOOCs) such as the Coursera Data Science Program and in courses here at Pluralsight (including my own on R and RStudio). Folks taking graduate studies that involve crunching data are bound to encounter R, and as with many other technologies, its introduction in schools leads naturally to wider adoption in industry. R’s presence in higher education confirms the demand for these skills in business settings.

2015-9-4 16:33:25
R is profitable
Technology is fun, sure, but most of us who enjoy it also do it for a living. Fortunately, R is not only a pleasure to use; demand for it in business often translates into higher salaries for its practitioners. The Dice Technology Salary Survey conducted last year ranked R among the highest-paying skills. The most recent O’Reilly Data Science Salary Survey also includes R among the skills used by the highest-paid data scientists.

2015-9-4 16:35:56
The R community is diverse, with members from many different professional backgrounds, including academics, scientists, statisticians, business analysts and professional programmers, among others. CRAN, the Comprehensive R Archive Network, maintains packages created by community members that reflect this colorful background. Packages exist to perform stock market analysis, create maps, engage in high-throughput genomic analysis and do natural language processing. This is only the tip of the iceberg; over 7000 packages are available on CRAN as of this writing. Additionally, R-Bloggers is a blog-aggregation site that serves as a hub for news related to the R community.