feat: view explore page trends and search for posts in a trend #234

Open
wants to merge 1 commit into
base: main

mika-jpd

Add support for X's Explore Page trending content

This PR adds the ability to fetch trending keywords and hashtags from X's Explore page by querying X's GenericTimelineById endpoint.

Changes

twscrape.api

  • added the GQL ID and endpoint name
OP_Explore = "5u36Lskx1dfACjC_WHmH3Q/GenericTimelineById"
  • added methods to attach the appropriate request variables and features and parse trends
async def list_explore_raw(self, timeline_id: str, kv: dict = None)
async def list_explore(self, timeline_id: str = None, limit=-1, kv=None)
  • added a method to search for tweets belonging to a trend by passing "querySource": "trend_click" to twscrape.api.search_raw
async def search_trend(self, q: str, limit: int = -1, kv=None)
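
As a rough illustration of the last bullet, the override passed to search_raw could be assembled along the following lines. This is only a sketch: build_trend_search_kv is a hypothetical helper name that does not appear in the PR, and it assumes search_raw merges the kv dict into its request variables.

```python
from __future__ import annotations

from typing import Any


def build_trend_search_kv(kv: dict[str, Any] | None = None) -> dict[str, Any]:
    """Hypothetical helper: tag a search as coming from a trend click."""
    # Caller-supplied keys are unpacked last, so they can still override
    # the default "querySource" value if needed.
    return {"querySource": "trend_click", **(kv or {})}
```

With this shape, a caller passing {"product": "Top"} would end up sending both keys, which matches how search_trend is called in the example usage.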

twscrape.utils

  • modified to_old_rep(obj: dict) -> dict[str, dict] so that it also parses TimelineTrend items; it now returns {"tweets": {**tw1, **tw2}, "users": users, "trends": trends}

twscrape.models

  • added functions that parse the response object returned by list_explore_raw and yield TimelineTrend objects
def parse_trends(rep: httpx.Response, limit: int = -1) -> Generator[TimelineTrend, None, None]
def parse_trend(rep: httpx.Response) -> TimelineTrend | None
  • modified parse_items so it can now handle a response with TimelineTrend objects

  • created a new dataclass class TimelineTrend(JSONTrait)
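
The limit=-1 convention used by parse_trends can be illustrated with a simplified stand-in. Everything in this sketch is an assumption made for illustration: TrendSketch is not the PR's TimelineTrend (only its name attribute is visible in this description), and the flat list of dicts is an invented input shape, not X's actual response format.

```python
from __future__ import annotations

from dataclasses import dataclass
from typing import Generator, Iterable


@dataclass
class TrendSketch:
    """Stand-in for TimelineTrend; only `name` is known from the PR text."""

    name: str


def take_trends(items: Iterable[dict], limit: int = -1) -> Generator[TrendSketch, None, None]:
    """Yield trends from already-extracted dicts, stopping after `limit` items.

    A limit of -1 means "no limit".
    """
    count = 0
    for item in items:
        if limit != -1 and count >= limit:
            return
        name = item.get("name")
        if not name:
            # Skip entries that do not look like a trend.
            continue
        yield TrendSketch(name=name)
        count += 1
```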

Example Usage

import asyncio
import random

from twscrape import API, gather
from twscrape.models import TimelineTrend, Tweet  # TimelineTrend is added by this PR

api = API()  # assumes accounts were already added to the default pool


async def list_trending(timeline: str = "trending") -> list[TimelineTrend]:
    timeline_to_id: dict[str, str] = {
        "trending": "VGltZWxpbmU6DAC2CwABAAAACHRyZW5kaW5nAAA=",
        "news": "VGltZWxpbmU6DAC2CwABAAAABG5ld3MAAA==",
        "sports": "VGltZWxpbmU6DAC2CwABAAAABnNwb3J0cwAA",
        "entertainment": "VGltZWxpbmU6DAC2CwABAAAADWVudGVydGFpbm1lbnQAAA==",
    }
    # fall back to the news timeline for unknown categories
    id_ = timeline_to_id.get(timeline, timeline_to_id["news"])
    trends = await gather(api.list_explore(timeline_id=id_))
    return trends


async def search_trending(q: str, product: str = "Top", limit: int = 10) -> list[Tweet]:
    kv: dict = {
        "product": product
    }
    tweets = await gather(
        api.search_trend(q=q, limit=limit, kv=kv)
    )
    return tweets


if __name__ == "__main__":
    # get trending from explore page
    trends = asyncio.run(list_trending("news"))

    # fetch a random trend
    trend: TimelineTrend = random.choice(trends)
    name: str = trend.name

    # fetch tweets
    tweets = asyncio.run(search_trending(q=name, product="Top", limit=25))
    print(f"{name}: fetched {len(tweets)} tweets")

Notes

list_explore requires a timeline_id because GenericTimelineById serves four different timeline categories: trending, news, sports and entertainment. Their IDs are currently listed only in the timeline_to_id dictionary in the list_trending example above. A mapper from timeline category to ID could be added to list_explore, which could then take one of the four categories as a parameter. However, this differs from how the other methods operate, so I decided not to impose it.

I can add this in a follow-up change to this PR if needed!
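
For reference, such a mapper might look roughly like the sketch below. The names EXPLORE_TIMELINE_IDS, resolve_timeline_id and encode_timeline_id are hypothetical and not part of this PR. The encoder reflects a pattern observed only in the four IDs listed above (base64 of "Timeline:" followed by a fixed byte header, a 4-byte big-endian length, the category name and two null bytes); whether X accepts IDs built this way for other category names is untested.

```python
from __future__ import annotations

import base64
import struct

# The four IDs from the example usage in this PR.
EXPLORE_TIMELINE_IDS: dict[str, str] = {
    "trending": "VGltZWxpbmU6DAC2CwABAAAACHRyZW5kaW5nAAA=",
    "news": "VGltZWxpbmU6DAC2CwABAAAABG5ld3MAAA==",
    "sports": "VGltZWxpbmU6DAC2CwABAAAABnNwb3J0cwAA",
    "entertainment": "VGltZWxpbmU6DAC2CwABAAAADWVudGVydGFpbm1lbnQAAA==",
}


def resolve_timeline_id(category_or_id: str) -> str:
    """Return the ID for a known category, or pass a raw ID through unchanged."""
    return EXPLORE_TIMELINE_IDS.get(category_or_id.lower(), category_or_id)


def encode_timeline_id(category: str) -> str:
    """Rebuild an ID from a category name, following the observed byte layout.

    Assumption: the layout is inferred from the four known IDs only.
    """
    name = category.encode("ascii")
    raw = (
        b"Timeline:"
        + b"\x0c\x00\xb6\x0b\x00\x01"   # fixed header seen in all four IDs
        + struct.pack(">I", len(name))  # 4-byte big-endian name length
        + name
        + b"\x00\x00"                   # trailing null bytes
    )
    return base64.b64encode(raw).decode("ascii")
```

With a resolver like this, list_explore could accept either "news" or a raw timeline ID, which keeps the current call style working while allowing the shorter form.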

…hods twscrape.api.list_explore and twscrape.api.list_explore_raw to prepare and query X's GenericTimelineById, parser methods twscrape.models.parse_trends and twscrape.models.parse_trend, a Timeline trend object and a twscrape.api.search_trend to search tweets in a trend with the appropriate search variables.
@mika-jpd mika-jpd requested a review from vladkens as a code owner February 25, 2025 23:43