Parameterized Text Indexing with One Wildcard

Arnab Ganguly, Wing-Kai Hon, Yu-An Huang, Solon P. Pissis, Rahul Shah, Sharma V. Thankachan

2019 Data Compression Conference (DCC), 2019


Abstract

Two equal-length strings X and Y over an alphabet Σ of size σ are a parameterized match iff X can be transformed into Y by renaming the character X[i] to the character Y[i] for 1 ≤ i ≤ |X| using a one-to-one function from the set of characters in X to the set of characters in Y. The parameterized text indexing problem is defined as follows: index a text T of n characters over an alphabet Σ of size σ such that, whenever a pattern P[1, p] comes as a query, we can report all occ parameterized occurrences of P in T. A position i ∈ [1, n] is a parameterized occurrence of P in T iff P and T[i, (i+p-1)] are a parameterized match. We study an interesting generalization of this problem, where the pattern contains one wildcard character φ ∉ Σ that matches any character in Σ. Therefore, for a pattern P[1, p] = P₁φP₂, our task is to report all positions i in T such that the string P₁P₂ and the string obtained by concatenating T[i, (i+|P₁|-1)] and T[(i+|P₁|+1), (i+p-1)] are a parameterized match. We show that such queries can be answered in optimal O(p + occ) time per query using an O(n log n) space index. We then show how to compress our index into O(n log σ) space, but with a higher query cost of O(p(log log n + log σ) + occ log σ).
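The two definitions in the abstract can be illustrated with a brute-force checker. The following is a minimal Python sketch, not the index described in the paper: it scans every window of T and therefore runs in O(n·p) time rather than O(p + occ). The function names `is_pmatch` and `pmatch_one_wildcard` are hypothetical, and the sketch assumes that every character is renamable and that the pattern contains exactly one wildcard symbol.

```python
from typing import Dict, List


def is_pmatch(x: str, y: str) -> bool:
    """Return True iff x and y are a parameterized match, i.e. there is a
    one-to-one renaming of the characters of x that turns x into y."""
    if len(x) != len(y):
        return False
    fwd: Dict[str, str] = {}  # character of x -> character of y
    bwd: Dict[str, str] = {}  # character of y -> character of x
    for a, b in zip(x, y):
        # setdefault records the pairing the first time a (or b) is seen and
        # returns the recorded partner afterwards; any disagreement means the
        # renaming is not a function or is not one-to-one.
        if fwd.setdefault(a, b) != b or bwd.setdefault(b, a) != a:
            return False
    return True


def pmatch_one_wildcard(text: str, pattern: str, wildcard: str = "φ") -> List[int]:
    """Naive scan (assumption: NOT the paper's index). Reports all 1-based
    positions i such that P1 P2 parameterized-matches the window of text
    starting at i with the character under the wildcard skipped."""
    p = len(pattern)
    k = pattern.index(wildcard)            # |P1|; raises ValueError if absent
    core = pattern[:k] + pattern[k + 1:]   # P1 P2
    hits: List[int] = []
    for i in range(len(text) - p + 1):
        # Concatenate the part before and the part after the wildcard position.
        window = text[i:i + k] + text[i + k + 1:i + p]
        if is_pmatch(core, window):
            hits.append(i + 1)             # the abstract uses 1-based positions
    return hits
```

For example, `pmatch_one_wildcard("abaccb", "xφxy")` compares P₁P₂ = "xxy" against the windows "aac", "bcc", and "acb"; only the first admits a one-to-one renaming, so position 1 is the only occurrence reported.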


Keywords

Parameterized Text Indexing, Wildcard, Heavy Path, Succinct Data Structures, Suffix Tree
