c# - Find intersection group of sorted integer arrays -
let's have integer short sorted arrays , need find intersection equal or more predefined constant. here code , demonstrates want better can explain in words. problem speed. code working slow. takes 15 sec on 2000 elements array(on slow machine). ofcourse can implement own intersection method , parallize code give limited improvement. execution time growing n^2 or , 500k arrays takes very long time. how can rewrite algorithm better perfomance? not limited c# language maybe cpu or gpu has special instructions such job.
example: input: 1,3,7,8 2,3,8,10 3,10,11,12,13,14 minsupport = 1 output: 1 , 2: 2, 8 1 , 3: 3 2 , 3: 3, 10 var minsupport = 2; var random = new random(datetime.now.millisecond); // numbers each array unique var sortedarrays = enumerable.range(0,2000) .select(x => enumerable.range(0,30).select(t => random.next(1000)).distinct() .tolist()).tolist(); var result = new list<int[]>(); var resultintersection = new list<list<int>>(); foreach (var array in sortedarrays) { array.sort(); } var sw = stopwatch.startnew(); //****main part*****// (int = 0; < sortedarrays.count-1; i++) { (int j = i+1; j < sortedarrays.count; j++) { var intersect = sortedarrays[i].intersect(sortedarrays[j]).tolist(); if(intersect.count()>=minsupport) { result.add( new []{i,j}); resultintersection.add(intersect); } } } //*****************// sw.stop(); console.writeline(sw.elapsed); edit:
now takes 9 sec vs 15 sec old algorithm on 2000 elements. well...ofcourse not fast enough.
//****main part*****// // number(max value array can contains) known var maxvalue = 1000; var reverseindexdict = new dictionary<int,list<int>>(); (int = 0; < maxvalue; i++) { reverseindexdict[i] = new list<int>(); } (int = 0; < sortedarrays.count; i++) { (int j = 0; j < sortedarrays[i].count; j++) { reverseindexdict[sortedarrays[i][j]].add(i); } } var temparr = new list<int>(); (int = 0; < sortedarrays.count; i++) { temparr.clear(); (int j = 0; j < sortedarrays[i].count; j++) { temparr.addrange(reverseindexdict[j]); } result.addrange(temparr.groupby(x => x).where(x => x.count()>=minsupport).select(x => new[]{i,x.key}).tolist()); } result = result.where(x => x[0]!=x[1]).tolist(); (int = 0; < result.count; i++) { resultintersection.add(sortedarrays[result[i][0]].intersect(sortedarrays[result[i][1]]).tolist()); } //*****************// edit:
some improvent.
//****main part*****// // number(max value array can contains) known var maxvalue = 1000; var reverseindexdict = new list<int>[maxvalue]; (int = 0; < maxvalue; i++) { reverseindexdict[i] = new list<int>(); } (int = 0; < sortedarrays.count; i++) { (int j = 0; j < sortedarrays[i].count; j++) { reverseindexdict[sortedarrays[i][j]].add(i); } } (int = 0; < sortedarrays.count; i++) { var temparr = new dictionary<int, list<int>>(); (int j = 0; j < sortedarrays[i].count; j++) { var sortedarraysij = sortedarrays[i][j]; (int k = 0; k < reverseindexdict[sortedarraysij].count; k++) { if(!temparr.containskey(reverseindexdict[sortedarraysij][k])) { temparr[reverseindexdict[sortedarraysij][k]] = new[]{sortedarraysij}.tolist(); } else { temparr[reverseindexdict[sortedarraysij][k]].add(sortedarrays[i][j]); } } } (int j = 0; j < reverseindexdict.length; j++) { if(reverseindexdict[j].count>=minsupport) { result.add(new[]{i,j}); resultintersection.add(reverseindexdict[j]); } } } // , here filtering collections //*****************//
there 2 solutions:
let suppose have 3 sorted arrays , have find intersection between them. traverse first array , run binary search on rest of 2 arrays element in first array. if respective binary search on 2 list gave positive, increment counter of intersection.
result = list element in array1: status1 = binarysearch(element, array2) status2 = binarysearch(element, array2) status = status & status if status == true: count++ if count == max_intersection: result.append(element) breaktime complexity : n * m * log(n),
where,
n = number of element in array
m = number of arraysthis solution works if number in arrays positive integers. calculate maximum , minimum number out of total elements in sorted arrays. sorted, can determine surveying start , end element of sorted arrays given. let greatest number max , lowest number min. create array of size max - min , fill zero. let suppose have 3 arrays, start traversing first array , and go respective index , increment value in created array. mentioned below:
element 5 in array 1, new_array[5]+=1traverse 3 sorted list , perform operation mentioned above. @ end traverse new_array , value equal 3, these indexes intersection result.
time complexity : o(n) + o(n) + .. = o(n)
space complexity : o(maximum_element - minimum_element)
where,
n = number of elements in array.
Comments
Post a Comment