c# - Find intersection group of sorted integer arrays -


let's have integer short sorted arrays , need find intersection equal or more predefined constant. here code , demonstrates want better can explain in words. problem speed. code working slow. takes 15 sec on 2000 elements array(on slow machine). ofcourse can implement own intersection method , parallize code give limited improvement. execution time growing n^2 or , 500k arrays takes very long time. how can rewrite algorithm better perfomance? not limited c# language maybe cpu or gpu has special instructions such job.

example:  input: 1,3,7,8 2,3,8,10 3,10,11,12,13,14  minsupport = 1  output:  1 , 2: 2, 8 1 , 3: 3 2 , 3: 3, 10 

    var minsupport = 2;     var random = new random(datetime.now.millisecond);      // numbers each array unique     var sortedarrays = enumerable.range(0,2000)     .select(x => enumerable.range(0,30).select(t => random.next(1000)).distinct()     .tolist()).tolist();     var result = new list<int[]>();     var resultintersection = new list<list<int>>();      foreach (var array in sortedarrays)     {         array.sort();     }      var sw = stopwatch.startnew();      //****main part*****//      (int = 0; < sortedarrays.count-1; i++)     {         (int j = i+1; j < sortedarrays.count; j++)         {             var intersect = sortedarrays[i].intersect(sortedarrays[j]).tolist();             if(intersect.count()>=minsupport)             {                 result.add( new []{i,j});                 resultintersection.add(intersect);             }         }     }      //*****************//      sw.stop();      console.writeline(sw.elapsed); 

edit:

now takes 9 sec vs 15 sec old algorithm on 2000 elements. well...ofcourse not fast enough.

//****main part*****//      // number(max value array can contains) known     var maxvalue = 1000;      var reverseindexdict = new dictionary<int,list<int>>();      (int = 0; < maxvalue; i++)     {         reverseindexdict[i] = new list<int>();     }      (int = 0; < sortedarrays.count; i++)     {         (int j = 0; j < sortedarrays[i].count; j++)         {             reverseindexdict[sortedarrays[i][j]].add(i);         }     }      var temparr = new list<int>();     (int = 0; < sortedarrays.count; i++)     {         temparr.clear();         (int j = 0; j < sortedarrays[i].count; j++)         {             temparr.addrange(reverseindexdict[j]);         }          result.addrange(temparr.groupby(x => x).where(x => x.count()>=minsupport).select(x => new[]{i,x.key}).tolist());      }      result = result.where(x => x[0]!=x[1]).tolist();       (int = 0; < result.count; i++)     {         resultintersection.add(sortedarrays[result[i][0]].intersect(sortedarrays[result[i][1]]).tolist());     }        //*****************// 

edit:

some improvent.

//****main part*****//      // number(max value array can contains) known     var maxvalue = 1000;      var reverseindexdict = new list<int>[maxvalue];      (int = 0; < maxvalue; i++)     {         reverseindexdict[i] = new list<int>();     }      (int = 0; < sortedarrays.count; i++)     {         (int j = 0; j < sortedarrays[i].count; j++)         {             reverseindexdict[sortedarrays[i][j]].add(i);         }     }        (int = 0; < sortedarrays.count; i++)     {         var temparr = new dictionary<int, list<int>>();          (int j = 0; j < sortedarrays[i].count; j++)         {             var sortedarraysij = sortedarrays[i][j];               (int k = 0; k < reverseindexdict[sortedarraysij].count; k++)             {                 if(!temparr.containskey(reverseindexdict[sortedarraysij][k]))                 {                     temparr[reverseindexdict[sortedarraysij][k]] = new[]{sortedarraysij}.tolist();                 }                 else                 {                    temparr[reverseindexdict[sortedarraysij][k]].add(sortedarrays[i][j]);                 }              }         }           (int j = 0; j < reverseindexdict.length; j++)         {             if(reverseindexdict[j].count>=minsupport)             {                 result.add(new[]{i,j});                 resultintersection.add(reverseindexdict[j]);             }         }      }      // , here filtering collections      //*****************// 

there 2 solutions:

  1. let suppose have 3 sorted arrays , have find intersection between them. traverse first array , run binary search on rest of 2 arrays element in first array. if respective binary search on 2 list gave positive, increment counter of intersection.

    result = list element in array1:     status1 = binarysearch(element, array2)     status2 = binarysearch(element, array2)     status = status & status     if status == true:         count++         if count == max_intersection:             result.append(element)             break 

    time complexity : n * m * log(n),
    where,
    n = number of element in array
    m = number of arrays

  2. this solution works if number in arrays positive integers. calculate maximum , minimum number out of total elements in sorted arrays. sorted, can determine surveying start , end element of sorted arrays given. let greatest number max , lowest number min. create array of size max - min , fill zero. let suppose have 3 arrays, start traversing first array , and go respective index , increment value in created array. mentioned below:

    element 5 in array 1, new_array[5]+=1 

    traverse 3 sorted list , perform operation mentioned above. @ end traverse new_array , value equal 3, these indexes intersection result.

    time complexity : o(n) + o(n) + .. = o(n)
    space complexity : o(maximum_element - minimum_element)
    where,
    n = number of elements in array.


Comments

Popular posts from this blog

java - Play! framework 2.0: How to display multiple image? -

gmail - Is there any documentation for read-only access to the Google Contacts API? -

php - Controller/JToolBar not working in Joomla 2.5 -