citation-network
$ npx mdskill add aipoch/medical-research-skills/citation-network
Convert citation pairs into visual networks for rapid literature analysis.
- Identify influential papers and research communities from citation data.
- Depends on CSV input with source and target citation relationships.
- Uses de-duplication by DOI or title to build accurate directed graphs.
- Delivers interactive HTML views and Gephi-compatible graph files.
SKILL.md
.github/skills/citation-network
---
name: citation-network
description: Build and visualize a citation network from a source/target CSV to identify key papers, communities, and emerging hotspots; use when you have citation pairs and need fast literature review or trend analysis.
license: MIT
author: aipoch
---

> **Source**: [https://github.com/aipoch/medical-research-skills](https://github.com/aipoch/medical-research-skills)

## When to Use

- You have a citation relationship table (who cites whom) and want to quickly turn it into a directed network for analysis.
- You are conducting a literature review and need to identify influential papers (high in-degree / centrality) and core clusters.
- You want to detect community structures (research subfields) and compare them across time or datasets.
- You need an interactive, shareable visualization (HTML) or a Gephi-importable graph file (GEXF).
- You are positioning a new project and want evidence of research hotspots and bridging papers between communities.

## Key Features

- Builds a directed citation graph from a minimal CSV containing `source` and `target`.
- De-duplicates nodes by identifier (DOI recommended; otherwise unique titles).
- Exports:
  - `citation_network.gexf` for Gephi and other graph tools
  - `network_metrics.json` for basic network statistics
  - `citation_network.html` for interactive browser viewing (auto-generated by the build script)
- Run-directory workflow to keep each execution reproducible and isolated under `outputs/runs/<timestamp>/`.
- Optional input encoding control to avoid garbled characters (e.g., UTF-8 / UTF-8-SIG).
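
The graph-building and de-duplication behaviour listed under Key Features can be illustrated with a minimal, dependency-free sketch. It is not the code in `scripts/build_citation_network.py`; the sample rows and the helper names `load_edges` and `summarize` are hypothetical, and the standard library stands in for pandas and networkx so that the idea is visible on its own.

```python
import csv
import io
from collections import Counter

# Hypothetical sample rows; the repeated "Paper A,Paper B" pair is deliberate.
SAMPLE_CSV = """source,target
Paper A,Paper B
Paper A,Paper C
Paper A,Paper B
Paper B,Paper C
"""


def load_edges(handle):
    """Return the set of unique (source, target) pairs from a CSV handle."""
    edges = set()
    for row in csv.DictReader(handle):
        source = (row.get("source") or "").strip()
        target = (row.get("target") or "").strip()
        if source and target:  # skip rows with a missing endpoint
            edges.add((source, target))
    return edges


def summarize(edges):
    """Compute node/edge counts, directed density, and in-degree per node."""
    nodes = {node for edge in edges for node in edge}
    n, m = len(nodes), len(edges)
    # Density of a directed graph without self-loops: m / (n * (n - 1)).
    density = m / (n * (n - 1)) if n > 1 else 0.0
    in_degree = Counter(target for _, target in edges)
    return {"nodes": n, "edges": m, "density": density, "in_degree": dict(in_degree)}


edges = load_edges(io.StringIO(SAMPLE_CSV))
metrics = summarize(edges)
print(metrics)
```

Because nodes are keyed by the exact identifier string, `Paper A` and `paper a` would count as two different papers here, which is why consistent DOIs are the safer identifier. A networkx `DiGraph` collapses repeated edges in the same way, so the counts above should match what a networkx-based build reports for the same input.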
## Dependencies

- Python 3.10+
- pandas >= 2.0
- networkx >= 3.0
- (Optional, for HTML visualization) pyvis >= 0.3

## Example Usage

### 1) Initialize a run directory

```bash
python scripts/init_run.py
```

This creates a new run folder:

```text
outputs/runs/<timestamp>/
  config.json
  data/
  outputs/
```

### 2) Prepare the citation CSV (minimal)

Create `citations.csv` and place it into:

```text
outputs/runs/<timestamp>/data/citations.csv
```

Minimal CSV format:

```csv
source,target
Paper A,Paper B
Paper A,Paper C
```

Recommended DOI-based identifiers:

```csv
source,target
10.1234/abcd.1,10.1234/abcd.2
10.1234/abcd.1,10.1234/abcd.3
```

### 3) Confirm configuration

Open:

```text
outputs/runs/<timestamp>/config.json
```

Ensure the configured input filename and column names match your CSV (at minimum `source` and `target`).

If you see garbled characters, set an explicit encoding (e.g., `utf-8` or `utf-8-sig`) via an `input_encoding` field if supported by the config.

### 4) Build the citation network

```bash
python scripts/build_citation_network.py
```

The build script will also generate the HTML automatically (you do not need to run `scripts/export_gexf_html.py` manually).

### 5) Inspect outputs

Expected outputs under the same run directory:

- `citation_network.gexf` (import into Gephi)
- `network_metrics.json` (node/edge counts, density, etc.)
- `citation_network.html` (open in a browser)

## Implementation Details

### Data Model

- **Nodes**: papers, identified by the value in `source`/`target` (DOI preferred; otherwise a unique, consistent title string).
- **Edges**: directed citations `source -> target`.

### Input Requirements and Constraints

- The network builder reads **only** the `source` and `target` columns.
- Additional columns (e.g., author/year/venue) are ignored by the current scripts.
- If you need metadata, maintain a separate table for downstream joining/annotation (not consumed by the builder), for example:

  ```csv
  id,title,authors,year,doi
  10.1234/abcd.1,Paper A,"Zhang, Wei; Li, Ming",2021,10.1234/abcd.1
  10.1234/abcd.2,Paper B,"Wang, Fang",2019,10.1234/abcd.2
  ```

### Run Directory Standard

- Always run `python scripts/init_run.py` before an execution to create a new run directory.
- All inputs, configs, and outputs must remain inside `outputs/runs/<timestamp>/`.
- By default, scripts operate on the latest run directory under `outputs/runs/`.

### Metrics and Analysis (Conceptual)

- Basic network statistics are exported to `network_metrics.json` (e.g., node/edge counts, density).
- Typical downstream analyses include:
  - centrality (degree, betweenness)
  - community detection (e.g., Louvain), if enabled/implemented in the pipeline

### Common Failure Modes

- **Garbled characters**: ensure CSV is UTF-8/UTF-8-SIG; set `input_encoding` in `config.json` if available.
- **Duplicate nodes**: identical identifiers are treated as the same node; prefer DOIs or enforce unique titles.
- **Empty or missing output**: verify the CSV header names match the configured `source`/`target` columns.

### Related References

- Data cleaning checklist: `references/data-cleaning-checklist.md`
- Network metrics notes: `references/network-metrics-notes.md`
- Additional documentation: `references/README.md`
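
The header and encoding problems listed under Common Failure Modes can often be caught before a build with a small pre-flight check. The sketch below is an assumption about how such a check might look, not part of the skill's scripts: `missing_columns` and `check_csv_bytes` are hypothetical helpers, and the sample bytes imitate a spreadsheet export that begins with a UTF-8 byte-order mark.

```python
import csv
import io

REQUIRED_COLUMNS = ("source", "target")


def missing_columns(handle, required=REQUIRED_COLUMNS):
    """Return the required column names absent from the CSV header."""
    reader = csv.reader(handle)
    header = next(reader, [])  # empty list if the file has no rows
    present = {name.strip() for name in header}
    return [name for name in required if name not in present]


def check_csv_bytes(raw, encoding="utf-8-sig"):
    """Decode raw CSV bytes and report missing required columns."""
    text = raw.decode(encoding)
    return missing_columns(io.StringIO(text))


# A UTF-8 file with a byte-order mark, as some spreadsheet exports produce.
with_bom = b"\xef\xbb\xbfsource,target\nPaper A,Paper B\n"

print(check_csv_bytes(with_bom, "utf-8-sig"))  # [] : BOM stripped, header matches
print(check_csv_bytes(with_bom, "utf-8"))      # ['source'] : BOM stays glued to "source"
```

The second call shows one way a file that looks correct in an editor can still produce an empty network: decoded as plain `utf-8`, the first header cell is `\ufeffsource` rather than `source`, so a lookup by column name fails. Reading with `utf-8-sig` is harmless for files without a byte-order mark, which makes it a reasonable default for this kind of check.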
More from aipoch/medical-research-skills
- 3d-molecule-ray-tracer: Generate photorealistic rendering scripts for PyMOL and UCSF ChimeraX.
- abstract-summarizer: Transform lengthy academic papers into concise, structured 250-word abstracts.
- abstract-trimmer: Precision editing tool that reduces abstract word count through intelligent compression techniques, maintaining scientific rigor while meeting strict journal and conference requirements.
- academic-abstract-refiner: Refines long medical academic texts into SCI-style unstructured Chinese and English abstracts; use when you need to condense drafts/reports/summaries into bilingual abstracts and generate Summary_Report.md.
- academic-cv-generator: Generate structured academic CVs from free-form Chinese/English text and export to Word (.docx). Use this skill when you are asked to organize, generate, or optimize an academic CV (e.g., publications/projects/awards) into a consistent, formatted document with uniform-colored section headers and optional bilingual output.
- academic-highlight-generator: Generates submission-ready Elsevier/SCI Highlights from manuscript text or extracted PDF/DOCX/TXT content. Use when a user needs 3-5 concise, evidence-grounded highlight bullets for a research paper, review, meta-analysis, case report, or bioinformatics manuscript.
- academic-norm-review: Detects content similarity, verifies standardized citations and abbreviations, and flags potential academic integrity risks; use it before submission, during academic writing QA, or for compliance reviews.
- academic-poster-generator: Complete workflow for generating academic research posters from PDF literature; use when you need to extract paper content from PDFs and produce a LaTeX-based poster (beamerposter/tikzposter/baposter) with mandatory figure generation and a final rendered HTML deliverable.
- acronym-unpacker: Intelligent medical abbreviation disambiguation tool that resolves ambiguous acronyms using clinical context, specialty-specific knowledge, and document-level semantic analysis.
- active-comparator-single-soc-faers-safety-comparison: Generates complete FAERS pharmacovigilance study designs for multi-drug or class-level safety comparison inside one predefined SOC or AE family using active comparators, disproportionality analysis, subgroup characterization, and reviewer-facing evidence control.