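Hi everyone, I have a question:
How can I count the total number of times a given term appears in a dataset? The term may appear in any column.
For example, take the following dataset: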
Film | Director | Producer | Starring | Release |
Fight Club | David Fincher | Art Linson | Brad Pitt | 1999 |
The Departed | | Brad Pitt | Leonardo DiCaprio | 2006 |
World War Z | Marc Forster | Brad Pitt | Brad Pitt | 2013 |
12 Years a Slave | Steve McQueen | Brad Pitt | Chiwetel Ejiofor | 2013 |
The Aviator | Martin Scorsese | Michael Mann | Leonardo DiCaprio | 2004 |
Blood Diamond | Edward Zwick | | Leonardo DiCaprio | 2006 |
The Wolf of Wall Street | | Martin Scorsese | Leonardo DiCaprio | 2013 |
Brad Pitt | Moneyball | 2011 | | |
Sleepy Hollow | Tim Burton | Scott Rudin | Johnny Depp | 1999 |
Pirates of the Caribbean: Dead Man's Chest | Gore Verbinski | Jerry Bruckheimer | Johnny Depp | 2006 |
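How can I count the number of occurrences of "Brad Pitt"?
The column names follow no pattern, and the number of observations is not known in advance.
Thanks!
P.S. The CSV dataset for the example above is attached (just rename the .txt extension to .csv).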
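
One possible approach, sketched in Python purely as an illustration (the post does not say which tool is being used, so the language is an assumption): read every row and compare every cell against the target, without ever referring to column names or to the number of rows. The helper name `count_value` and the inline `SAMPLE` text are hypothetical; `SAMPLE` reproduces only a few rows of the example and assumes a comma-separated layout for the attached file.

```python
import csv
import io


def count_value(rows, target):
    """Count cells that equal `target` across every column of every row."""
    return sum(
        1
        for row in rows
        for cell in row
        if cell.strip() == target  # whole-cell match, ignoring surrounding spaces
    )


# A few rows from the example, written as CSV text (assumed layout).
SAMPLE = """Film,Director,Producer,Starring,Release
Fight Club,David Fincher,Art Linson,Brad Pitt,1999
The Departed,,Brad Pitt,Leonardo DiCaprio,2006
World War Z,Marc Forster,Brad Pitt,Brad Pitt,2013
Brad Pitt,Moneyball,2011,,
"""

reader = csv.reader(io.StringIO(SAMPLE))
next(reader)  # skip the header row so column names are never counted
total = count_value(reader, "Brad Pitt")
print(total)  # prints 5 for these four data rows
```

For a real file, `io.StringIO(SAMPLE)` would be replaced by `open(path, newline="")`, where `path` is wherever the attachment was saved. This counts whole-cell matches only; if the term can also occur inside a longer cell, the comparison could be changed to `target in cell`.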