zhjx19 posted on 2022-12-14 12:36
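Element-by-element for loops are a habit that should have been dropped long ago. Look at how concise and elegant the vectorized, data-oriented style is: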
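library(tidyverse)

# Sample addresses: comma-separated fields; the district (区) is not at a fixed position
df <- tibble(
  x = c("黄渡敬老院,新黄路,嘉定区,上海市,201804,中国",
        "伴亭路,九里亭街道,松江区,上海市,201101,中国",
        "豫园,人民路,外滩街道,黄浦区,上海市,200010,中国",
        "若瑟登36,四川南路,外滩街道,黄浦区,上海市,200002,中国",
        "崇明区,上海市,中国",
        "崇明区,上海市,中国"))

df %>%
  # Extract the first run of Chinese characters ending in "区" from each address
  mutate(区 = str_extract(x, "[\u4e00-\u9fa5]+区")) %>%
  # Keep one row per distinct district
  distinct(区)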
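
The same idea also works on a plain character vector, without a data frame. The sketch below is only an illustration: the vector `addr` and its last element (an address with no district) are made-up inputs, not taken from the post. It shows that `str_extract()` returns `NA` where the pattern does not match, so those entries may need to be dropped before de-duplicating.

```r
library(stringr)

# Hypothetical input (not from the post): the last address has no district.
addr <- c("豫园,人民路,外滩街道,黄浦区,上海市,200010,中国",
          "崇明区,上海市,中国",
          "崇明区,上海市,中国",
          "上海市,中国")

# str_extract() is vectorized: one call returns one result per element,
# and NA for elements where the pattern does not match.
district <- str_extract(addr, "[\u4e00-\u9fa5]+区")

# Drop the NAs, then keep each district once.
unique(district[!is.na(district)])
```

Whether to drop the `NA` entries or keep them as a visible "no district found" marker depends on what the result is used for; the original pipeline keeps them.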