json - R - streamR - Number of items to replace is not a multiple of replacement length
I'm attempting to use the streamR package in R to siphon a random sample of tweets from the Twitter Streaming API. When I use the sampleStream function and then try to pull the results in with parseTweets, I get two confusing warning messages that I cannot interpret, and that seem to indicate that sampleStream cuts off mid-process.
These are the warnings I receive. The first one appears when I specify the number of tweets I'd like to pull in the sampleStream function. The second one appears regardless:
> sampleStream(file.name = "tweets.json",
+              timeout = 0,
+              tweets = 200,
+              oauth = twitter.oauth,
+              verbose = TRUE)
Capturing tweets...
Connection to Twitter stream was closed after 2 seconds with up to 200 tweets downloaded.
> tweets <- parseTweets(tweets = "tweets.json",
+                       simplify = FALSE,
+                       verbose = TRUE)
66 tweets have been parsed.
Warning messages:
1: In readLines(tweets) : incomplete final line found on 'tweets.json'
2: In vect[notnulls] <- unlist(lapply(lst[notnulls], function(x) x[[field[1]]][[field[2]]][[as.numeric(field[3])]][[field[4]]])) :
  number of items to replace is not a multiple of replacement length
Upon analysis of the JSON, I find that the last line (the last JSON object) is in fact cut off mid-line. It appears that sampleStream isn't finishing the job of pulling the requested number of tweets. Perhaps this ties into the second warning message.
My script is being authenticated to use the Twitter API. The key and secret have been substituted. The authorization script works, but I've included it in case it might be related to the problem:
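For reference, the truncated record could be dropped before parsing with something like the following. This is only a minimal sketch in base R, assuming each tweet occupies exactly one line of tweets.json. The helper name drop_incomplete_lines is hypothetical and not part of streamR, and the brace check is a rough heuristic rather than real JSON validation. It targets the "incomplete final line" warning only and is not known to affect the second warning.

```r
# Hypothetical helper (not part of streamR): keep only the lines that
# look like complete JSON objects and write them back to disk.
drop_incomplete_lines <- function(path, out = path) {
  # warn = FALSE suppresses the "incomplete final line" warning here
  lines <- readLines(path, warn = FALSE)
  lines <- trimws(lines)
  lines <- lines[nzchar(lines)]          # drop blank keep-alive lines
  # Heuristic: a complete record starts with "{" and ends with "}"
  complete <- startsWith(lines, "{") & endsWith(lines, "}")
  writeLines(lines[complete], out)       # writeLines adds a final newline
  invisible(sum(!complete))              # number of records dropped
}
```

Calling drop_incomplete_lines("tweets.json") before parseTweets would rewrite the file without the cut-off record; passing a different out path keeps the original file untouched.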
library(ROAuth)
twitter.oauth <- OAuthFactory$new(consumerKey = [key],
                                  consumerSecret = [secret],
                                  requestURL = "https://api.twitter.com/oauth/request_token",
                                  accessURL = "https://api.twitter.com/oauth/access_token",
                                  authURL = "https://api.twitter.com/oauth/authorize")
twitter.oauth$handshake(cainfo = system.file("CurlSSL", "cacert.pem", package = "RCurl"))
save(twitter.oauth, file = "twitter_oauth.rda")
My script for pulling a sample of public tweets from the firehose:
library(streamR)
load("twitter_oauth.rda")
sampleStream(file.name = "tweets.json",
             timeout = 0,
             tweets = 200,
             oauth = twitter.oauth,
             verbose = TRUE)
tweets <- parseTweets(tweets = "tweets.json",
                      simplify = FALSE,
                      verbose = TRUE)