Random Generation and Enumeration of Proper Interval Graphs

Toshiki Saitoh (JAIST), Katsuhisa Yamanaka (Univ. Electro-Communi.), Masashi Kiyomi (JAIST), Ryuhei Uehara (JAIST)

Introduction 

Processing of huge amounts of data

  Data mining, bioinformatics, etc.
  The data have a certain structure
  Interval graphs in bioinformatics

Our problems (P.I.G.: proper interval graphs)

  Random generation of P.I.G.
  Enumeration of P.I.G.
  P.I.G.: subclass of interval graphs
  P.I.G. ⇒ strings of parentheses (generating, enumerating)

Known Algorithms 

Generation of a string of parentheses

  D. B. Arnold and M. R. Sleep, 1980
  Cannot generate P.I.G. uniformly at random
  Not a one-to-one correspondence between strings and P.I.G.

Enumeration of strings of parentheses

  D. E. Knuth, 2005
  Cannot enumerate every P.I.G. in O(1) time
  Constant-size difference between strings ⇔ large difference between the corresponding P.I.G.

Our Algorithms 

Random Generation

  Input: natural number n
  Output: a connected P.I.G. of n vertices
  Uniformly at random
  Using a counting algorithm
  O(n+m) time (m: # edges)

Enumeration

  Input: natural number n
  Output: all the connected P.I.G. of n vertices
  Without duplication
  Based on the reverse search algorithm
  O(1) time/graph

Interval Graphs 

Have interval representations

Proper Interval Graphs 

Have unit interval representations

Every interval has the same length

Definition 

String Representation 

Encodes a unit interval representation as a string

Sweep the unit interval representation from left to right

  Left endpoint → “(” : left parenthesis
  Right endpoint → “)” : right parenthesis
  Right endpoints appear in the same order as their left endpoints

[Figure: a unit interval representation and its string representation ((())(()))]

Height = # “(” − # “)” (counted over the parentheses read so far)
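The sweep rule can be turned around to recover the graph from a string. The following is a minimal sketch, not the authors' implementation: it assumes vertices are numbered 0, 1, 2, … in the order of their left endpoints, and the function name `string_to_edges` is hypothetical.

```python
from collections import deque


def string_to_edges(s):
    """Return the edges of the proper interval graph encoded by the string s."""
    edges = []
    open_now = deque()   # vertices whose intervals are currently open, oldest first
    next_vertex = 0      # number given to the next "(" that is read
    for ch in s:
        if ch == "(":
            # A new interval overlaps every interval that is still open.
            edges.extend((u, next_vertex) for u in open_now)
            open_now.append(next_vertex)
            next_vertex += 1
        else:
            # Right endpoints follow the order of the left endpoints,
            # so this ")" closes the oldest open interval.
            open_now.popleft()
    return edges
```

Each edge is produced exactly once, so the conversion takes time proportional to the string length plus the number of edges.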

String Representation 

Property of the string rep. of a P.I.G. of n vertices

  Number of parentheses: 2n
    Number of “(”: n
    Number of “)”: n
  Non-negative
    Each left parenthesis lies to the left of its right parenthesis

  String:    (  (  (  )  )  (  (  )  )  )
  Step:     +1 +1 +1 -1 -1 +1 +1 -1 -1 -1
  Height:  0  1  2  3  2  1  2  3  2  1  0

  Every height is non-negative
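The two properties (n parentheses of each kind, non-negative heights) can be checked mechanically. A small sketch, with the hypothetical helper names `heights` and `is_string_rep`:

```python
def heights(s):
    """Heights before the first parenthesis and after each parenthesis of s."""
    h, out = 0, [0]
    for ch in s:
        h += 1 if ch == "(" else -1
        out.append(h)
    return out


def is_string_rep(s):
    """True if s has as many "(" as ")" and its height never becomes negative."""
    hs = heights(s)
    return hs[-1] == 0 and min(hs) >= 0
```

A string such as `(()))(` has equally many parentheses of each kind but is rejected, because its height becomes negative.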


Negative height

( ( ) ) ) (

The height drops to −1 after the fifth parenthesis, so this string does not correspond to any interval representation.


There are more than 2 places whose heights are 0.

( ( ) ) ( )

Each connected component corresponds to an area that is bounded by 2 places whose heights are 0.

[Figure: unit interval representation and graph representation of ( ( ) ) ( )]

String Representation 

Observation 1 

String rep. of a connected P.I.G.

  Has exactly 2 places whose heights are 0: the left end and the right end
  The string obtained by removing the parentheses at both ends is non-negative

(( ( ( ) ) ) ( ( ) ) ))

0 0
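Observation 1 gives a simple connectivity test on the string: count the places where the height is 0. A minimal sketch (the names `zero_places`, `count_components` and `is_connected_rep` are hypothetical), assuming the input is already a valid string representation:

```python
def zero_places(s):
    """Number of places (including both ends) where the height of s is 0."""
    h, zeros = 0, 1          # the left end always has height 0
    for ch in s:
        h += 1 if ch == "(" else -1
        if h == 0:
            zeros += 1
    return zeros


def count_components(s):
    """Each component lies between two consecutive places of height 0."""
    return zero_places(s) - 1


def is_connected_rep(s):
    """True if the only places of height 0 are the left end and the right end."""
    return len(s) > 0 and zero_places(s) == 2
```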

String Representation 

Lemma 1. (X. Deng, P. Hell, J. Huang, 1996)

A connected P.I.G. has only one or two string representations.

This graph has exactly two string representations: (()(()())) and its reverse ((()())()), which are different strings.

[Figure: proper interval graph, its unit interval representation, and its string representation]

This graph has only one string representation: (()(())()). The string is reversible: read from right to left with “(” and “)” exchanged, it is the same string.

[Figure: proper interval graph, its unit interval representation, and its string representation]
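The reverse of a string representation corresponds to sweeping the same unit interval representation from right to left. A short sketch, with the hypothetical names `reverse_rep` and `is_reversible`:

```python
def reverse_rep(s):
    """Read s from right to left and exchange "(" and ")"."""
    swap = {"(": ")", ")": "("}
    return "".join(swap[ch] for ch in reversed(s))


def is_reversible(s):
    """True if the graph of s has only one string representation."""
    return s == reverse_rep(s)
```

For the two example graphs of Lemma 1, `reverse_rep("(()(()()))")` gives a different string, while `(()(())())` is its own reverse.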

Random Generation Algorithm of Proper Interval Graphs

Random Generation Algorithm 

Generate a string rep. uniformly at random

  Using a counting algorithm
  (Generalized) Catalan number

String rep.: path on the area

[Figure: the string ( ( ( ( ) ) ( ) ( ) ) ) drawn as a path whose height is 0 at both ends]

$C(n') = \frac{1}{n'+1}\binom{2n'}{n'}$

Adjust the Generation Probability Not easy

Non-reversible strings (two string rep. per graph)

  … (()(()())) ((()())()) …
  Decrease the generation probability

Reversible strings (one string rep. per graph)

  (()(())())

If strings are generated uniformly at random, a graph corresponding to non-reversible strings is generated with higher probability than a graph corresponding to a reversible string.

S_n: # non-reversible strings
R_n: # reversible strings

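Adjust the Generation Probability

$S_n + R_n = C(n)$, $R_n = \binom{n}{\lfloor n/2 \rfloor}$

Case 1, with probability $\frac{S_n + R_n}{S_n + 2R_n}$: generate a string rep. uniformly at random among all $S_n + R_n$ strings, e.g. (()(()())), ((()())()), (()(())())

Case 2, with probability $\frac{R_n}{S_n + 2R_n}$: generate a reversible string uniformly at random, e.g. (()(())())

Every graph is then generated with probability $\frac{2}{S_n + 2R_n}$, that is, uniformly at random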
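The effect of the two cases can be checked by brute force for small n. The sketch below is only an illustration under stated assumptions: n is the number of pairs in the strings counted by C(n), strings are enumerated exhaustively, and all function names are hypothetical.

```python
from fractions import Fraction
from itertools import product


def balanced_strings(n):
    """All non-negative strings with n pairs (brute force, small n only)."""
    words = []
    for chars in product("()", repeat=2 * n):
        h = 0
        for ch in chars:
            h += 1 if ch == "(" else -1
            if h < 0:
                break
        else:
            if h == 0:
                words.append("".join(chars))
    return words


def reverse_rep(s):
    swap = {"(": ")", ")": "("}
    return "".join(swap[ch] for ch in reversed(s))


def graph_probabilities(n):
    """Return (S_n, R_n, probability of each graph) under Case 1 / Case 2."""
    words = balanced_strings(n)
    r = sum(1 for w in words if w == reverse_rep(w))   # reversible strings
    s = len(words) - r                                 # non-reversible strings
    case1 = Fraction(s + r, s + 2 * r)
    case2 = Fraction(r, s + 2 * r)
    prob = {}
    for w in words:
        graph = min(w, reverse_rep(w))      # one key per graph
        p = case1 * Fraction(1, s + r)      # w is drawn in Case 1
        if w == reverse_rep(w):
            p += case2 * Fraction(1, r)     # w is drawn in Case 2
        prob[graph] = prob.get(graph, 0) + p
    return s, r, prob
```

For n = 4 this gives 8 non-reversible and 6 reversible strings, and every one of the 10 graphs receives the same probability.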

Generalized Catalan Number

Case 1 

$C(n, i) = \frac{i}{2n-i}\binom{2n-i}{n}$

Generation of a string uniformly at random

  Generate parentheses from the left
  Select “(” or “)”

[Figure: partially generated string ( ( ( ) ) ( ( ) ( ) ) )]

p = C(k, hl), q = C(k, hr)

k: # remaining parentheses, h: height

“(” : $\frac{p}{p+q} = \frac{(k-h)(h+2)}{2k(h+1)}$

“)” : $\frac{q}{p+q} = \frac{h(k+h+2)}{2k(h+1)}$

Time complexity

  String rep.: O(n)
  Graph rep.: O(n+m), m: # edges
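A left-to-right generator of this kind can be sketched as follows. This is an illustration, not the authors' code: it assumes that “(” is chosen with probability (k−h)(h+2) / (2k(h+1)), where k is the number of remaining parentheses and h the current height, which is 1 at height 0 and 0 when every remaining parenthesis must close. The function names are hypothetical.

```python
import random
from fractions import Fraction


def open_probability(k, h):
    """Probability of "(" with k parentheses remaining at height h."""
    return Fraction((k - h) * (h + 2), 2 * k * (h + 1))


def random_string(n, rng=random):
    """Generate a non-negative string with n pairs, one parenthesis at a time."""
    out, h, k = [], 0, 2 * n
    while k > 0:
        # Integer comparison avoids floating-point rounding.
        if rng.randrange(2 * k * (h + 1)) < (k - h) * (h + 2):
            out.append("(")
            h += 1
        else:
            out.append(")")
            h -= 1
        k -= 1
    return "".join(out)


def string_probability(s):
    """Exact probability that random_string produces the valid string s."""
    p, h, k = Fraction(1), 0, len(s)
    for ch in s:
        q = open_probability(k, h)
        p *= q if ch == "(" else 1 - q
        h += 1 if ch == "(" else -1
        k -= 1
    return p
```

By Observation 1, wrapping the result as `"(" + random_string(n - 1) + ")"` gives a string rep. of a connected P.I.G. of n vertices; this is uniform over strings, not yet over graphs.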

Generalized Catalan Number

Case 2 

Symmetric


Generation of a reversible string uniformly at random

  Generate a half of the string

    1. Choose the height h at the center (# half strings of length n with height h at the center: $\frac{h+1}{n+1}\binom{n+1}{(n-h)/2}$)
    2. Generate parentheses from the left end
       Select “(” or “)”
       p: complicated!

k: # remaining parentheses, h: height


Generate a half of the string, from the center to the right end

  1. Choose the height h at the center (# half strings of length n with height h at the center: $\frac{h+1}{n+1}\binom{n+1}{(n-h)/2}$)
  2. Generate parentheses from the center
     Select “(” or “)”

[Figure: right half ) ( ( ) ) ) ) generated from the center]

p = C(k, hl), q = C(k, hr)

k: # remaining parentheses, h: height

“(” : $\frac{p}{p+q} = \frac{(k-h)(h+2)}{2k(h+1)}$

“)” : $\frac{q}{p+q} = \frac{h(k+h+2)}{2k(h+1)}$

Time complexity

  String rep.: O(n)
  Graph rep.: O(n+m), m: # edges
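The two steps for a reversible string can be sketched in the same way. This is a hedged illustration: n denotes the length of one half, the center height is drawn with weight (h+1)/(n+1) · C(n+1, (n−h)/2), the right half is generated with the same selection probability for “(” as in left-to-right generation, and the left half is obtained by mirroring. The names `half_count` and `random_reversible_string` are hypothetical.

```python
import random
from math import comb


def half_count(n, h):
    """# non-negative strings of length n that end at height h."""
    if h < 0 or h > n or (n - h) % 2:
        return 0
    return (h + 1) * comb(n + 1, (n - h) // 2) // (n + 1)


def random_reversible_string(n, rng=random):
    """Generate a reversible string of length 2n (n = length of one half)."""
    candidates = list(range(n + 1))
    weights = [half_count(n, h) for h in candidates]
    h = rng.choices(candidates, weights=weights)[0]   # step 1: center height
    right, k = [], n
    while k > 0:                                      # step 2: center to right end
        if rng.randrange(2 * k * (h + 1)) < (k - h) * (h + 2):
            right.append("(")
            h += 1
        else:
            right.append(")")
            h -= 1
        k -= 1
    swap = {"(": ")", ")": "("}
    left = [swap[ch] for ch in reversed(right)]       # mirror image of the right half
    return "".join(left + right)
```

Summing `half_count(n, h)` over all h gives the number of reversible strings, which can be compared with the count of reversible strings R_n.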

Enumeration Algorithm of Proper Interval Graphs

Simple Enumeration Algorithm

  Remove 1 edge from a graph obtained so far
  The result may not be a P.I.G.
  Has the obtained graph already been output?
  Huge memory
  Low speed

Reverse Search Algorithm

[Figure: the canonical strings for n = 5, linked by "remove 1 edge" steps: ((((())))), (((()()))), (((())())), ((()()())), ((())(())), (()(())()), (()()()()), (((()))()), ((()())()), ((())()())]

Our algorithm

  Parent-child relation on canonical strings ⇒ spanning tree
  O(1) time/graph
  O(n) space

Canonical string: x ≤ reverse of x

Parent-Child Relation

[Figure: ((((())))), (((()()))), (((())())), ((()()())), ((())(()))]

Root: ( ( … ( ) ) … )

Parent of a canonical string x

  Replace the leftmost “) (” of x with “( )”
  The parent of x is canonical
  The root is an ancestor of x
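The parent rule is a single string operation. A minimal sketch, with the hypothetical function name `parent`:

```python
def parent(x):
    """Replace the leftmost ")(" of x with "()"; the root has no parent."""
    i = x.find(")(")
    if i < 0:
        return None          # x is the root ((...()...))
    return x[:i] + "()" + x[i + 2:]
```

Applying `parent` repeatedly to `(()()()())` passes through `((())()())`, `((()())())`, `(((()))())`, `(((())()))` and `(((()())))` before reaching the root `((((()))))`.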

Tree (n=5)

[Figure: tree with root ((((())))) whose nodes are (((()()))), (((())())), ((()()())), ((())(())), (((()))()), ((()())()), (()(())()), ((())()()), (()()()())]

Enumeration of all the strings by traversing the tree from the root

  Find child: O(1) time
  Output only the differences
  Prepostorder manner
  Output string: O(1) time
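The traversal can be illustrated with a deliberately naive reverse search. This sketch does not achieve O(1) time per graph: it finds children by trying every “( )” → “) (” exchange and checking the parent rule, and it assumes that canonical means x ≤ reverse of x in lexicographic order with “(” < “)”. All function names are hypothetical.

```python
def reverse_rep(s):
    swap = {"(": ")", ")": "("}
    return "".join(swap[ch] for ch in reversed(s))


def parent(x):
    i = x.find(")(")
    return None if i < 0 else x[:i] + "()" + x[i + 2:]


def is_connected_rep(s):
    """Height is positive everywhere except at the two ends."""
    h = 0
    for i, ch in enumerate(s):
        h += 1 if ch == "(" else -1
        if h <= 0 and i < len(s) - 1:
            return False
    return h == 0


def children(x):
    """Canonical connected strings whose parent is x (naive search)."""
    kids = []
    for i in range(len(x) - 1):
        if x[i:i + 2] == "()":
            y = x[:i] + ")(" + x[i + 2:]
            if is_connected_rep(y) and y <= reverse_rep(y) and parent(y) == x:
                kids.append(y)
    return kids


def enumerate_strings(n):
    """Traverse the tree from the root and list every canonical string once."""
    stack, out = ["(" * n + ")" * n], []
    while stack:
        x = stack.pop()
        out.append(x)
        stack.extend(children(x))
    return out
```

For n = 5 the traversal visits the ten strings of the tree, each exactly once.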

Conclusion and Future Work 

Random Generation and Enumeration of Connected Proper Interval Graphs of n vertices

  Random Generation: O(n+m) time
  Enumeration: O(1) time/graph

n vertices ⇒ at most n vertices

Random Generation and Enumeration of Interval Graphs