HITS

SQL function: cugraph_hits

Compute HITS hub and authority scores.

Signature

cugraph_hits(table_name [, src_col, dst_col [, weight_col [, options_json]]])

Allowed argument counts: 1, 3, 4, 5.

Quickstart

SELECT * FROM cugraph_hits('target_edges');

Positional arguments

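| Argument | Type | Required | Default | Notes |
| --- | --- | --- | --- | --- |
| table_name | Utf8 | yes | | |
| src_col | Utf8 | no | src | |
| dst_col | Utf8 | no | dst | |
| weight_col | Utf8 or null | no | | Accepted as an edge-column binding, but native algorithm execution does not consume weights, so it has no semantic effect for this algorithm. |
| options_json | Utf8 | no | | |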

JSON options

| Option | Type | Default | Constraints | Description |
| --- | --- | --- | --- | --- |
| epsilon | Float64 | 0.00001 | > 0 | |
| max_iterations | UInt32 | 100 | min 1 | |
| normalize | Boolean | false | | |
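
As a hedged sketch of the five-argument form, the call below overrides all three options. It assumes that a SQL NULL is an acceptable placeholder for weight_col (the argument is documented as nullable) and that options_json is a flat JSON object keyed by the option names listed above. The specific values are illustrative only and were chosen just to satisfy the stated constraints.

```sql
-- Sketch only: NULL for weight_col and the flat JSON object shape are assumptions.
SELECT vertex, hub_score, authority_score
FROM cugraph_hits(
  'target_edges',   -- table_name
  'src',            -- src_col
  'dst',            -- dst_col
  NULL,             -- weight_col: accepted, but HITS does not consume weights
  '{"epsilon": 0.000001, "max_iterations": 200, "normalize": true}'
);
```

Because the allowed argument counts are 1, 3, 4, and 5, supplying options_json means supplying all four preceding arguments as well.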

Graph construction options

Shared by all cuGraph functions, shown here with this function's defaults. The construction_policy option controls whether Nexus requests Python cuGraph-compatible edge normalization or bypasses it for raw libcugraph-style construction; see graph construction options for the full policy guide.

| Option | Type | Default | Constraints | Description |
| --- | --- | --- | --- | --- |
| construction_policy | Utf8 | "python_cugraph" | one of "python_cugraph", "raw_libcugraph" | Edge-list construction semantics used before calling libcugraph. |
| directed | Boolean | true | | Whether graph construction treats edges as directed. |
| renumber | Boolean | true | | Whether graph construction may renumber external vertex identifiers internally. |
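
A minimal sketch of switching the construction policy for a single call follows. It assumes the graph construction options are passed in the same options_json object as the algorithm options, which this page implies but does not state outright; check the graph construction options guide before relying on it.

```sql
-- Assumption: construction options share the options_json object
-- with the algorithm options. NULL for weight_col is also an assumption.
SELECT vertex, hub_score, authority_score
FROM cugraph_hits(
  'target_edges', 'src', 'dst', NULL,
  '{"construction_policy": "raw_libcugraph"}'
);
```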

Output schema

| Column | Type | Nullable | Description |
| --- | --- | --- | --- |
| vertex | Int64 | no | Vertex receiving HITS scores. |
| hub_score | Float64 | no | HITS hub score for the vertex. |
| authority_score | Float64 | no | HITS authority score for the vertex. |

Note: These are the generic registry schemas. Run cugraph_validate_call for the concrete, table-specific output schema of a particular call.

Examples

This example runs on the citation network demo dataset.

Hubs are surveys, authorities are foundations

HITS returns two scores per vertex in one pass. In a citation graph they separate two kinds of importance that PageRank blends: a hub cites many authorities (a good survey), and an authority is cited by many hubs (a foundational result). The same call, sorted two ways:

SELECT p.year, p.n_references, p.title
FROM cugraph_hits('citation_edges', 'src', 'dst') h
JOIN papers p ON p.paper_id = h.vertex
ORDER BY h.hub_score DESC
LIMIT 4;

| year | n_references | title |
| --- | --- | --- |
| 2019 | 292 | Deep Learning for Generic Object Detection: A Survey |
| 2018 | 276 | Deep Learning for Generic Object Detection: A Survey. |
| 2015 | 299 | Recent Advances in Convolutional Neural Networks |
| 2019 | 211 | Object Detection With Deep Learning: A Review |

SELECT p.year, p.n_citation, p.title
FROM cugraph_hits('citation_edges', 'src', 'dst') h
JOIN papers p ON p.paper_id = h.vertex
ORDER BY h.authority_score DESC
LIMIT 4;

| year | n_citation | title |
| --- | --- | --- |
| 2004 | 35,541 | Distinctive Image Features from Scale-Invariant Keypoints |
| 2014 | 18,029 | VERY DEEP CONVOLUTIONAL NETWORKS FOR LARGE-SCALE IMAGE RECOGNITION |
| 2012 | 16,802 | ImageNet Classification with Deep Convolutional Neural Networks |
| 2005 | 19,433 | Histograms of oriented gradients for human detection |

Three of the top four hubs are literally titled "Survey" or "Review"; the top authorities are the SIFT, VGG, AlexNet, and HOG papers. The mutually reinforcing definition lands both lists in computer vision, the field with the densest hub/authority structure, without being given any field labels.
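
Since every output row carries both scores, the two rankings can also be derived from one call. The query below is a sketch under the assumption that the engine supports common table expressions and the RANK() window function; neither is confirmed on this page, and whether the CTE is evaluated only once depends on the engine. The papers columns are the ones used in the queries above.

```sql
-- Sketch: one cugraph_hits call feeding both rankings.
-- Assumes CTE and RANK() window-function support in the engine.
WITH h AS (
  SELECT vertex, hub_score, authority_score
  FROM cugraph_hits('citation_edges', 'src', 'dst')
)
SELECT p.year,
       p.title,
       RANK() OVER (ORDER BY h.hub_score DESC)       AS hub_rank,
       RANK() OVER (ORDER BY h.authority_score DESC) AS authority_rank
FROM h
JOIN papers p ON p.paper_id = h.vertex
ORDER BY authority_rank
LIMIT 10;
```

A paper with a high hub_rank and a low authority_rank, or the reverse, is the kind of case where the two scores disagree and a single blended ranking would hide the difference.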

Limitations & notes

  • dry-run validates table resolution, column presence, static dtypes, and options only
  • dry-run does not scan edge data, construct a graph, or prove source-vertex existence

Validate before running

Always dry-run a call before executing it. Validation checks the function, table, columns, dtypes, and options without touching the GPU:

SELECT * FROM cugraph_validate_call(
'cugraph_hits',
'your_edges_table',
'{"src_col":"src","dst_col":"dst"}'
);

See Discovery & validation for the full cugraph_validate_call contract.