What is the most appropriate HBase schema for an RDBMS table?
I am a beginner with HBase. I want to migrate an RDBMS table to HBase.
The table schema in the RDBMS is:
Field            Type              Collation          Null  Key  Default  Extra           Privileges                       Comment
---------------  ----------------  -----------------  ----  ---  -------  --------------  -------------------------------  -------
id               int(16) unsigned  (null)             NO    PRI  (null)   auto_increment  select,insert,update,references
user_id          varchar(64)       latin1_swedish_ci  NO    MUL  (null)                   select,insert,update,references
type_id          int(11)           (null)             NO         (null)                   select,insert,update,references
application_id   int(16) unsigned  (null)             YES   MUL  (null)                   select,insert,update,references
title            varchar(128)      latin1_swedish_ci  YES        (null)                   select,insert,update,references
body             text              latin1_swedish_ci  YES        (null)                   select,insert,update,references
posted_time      datetime          (null)             YES        (null)                   select,insert,update,references
template_params  text              latin1_swedish_ci  YES        (null)                   select,insert,update,references
count            int(11)           (null)             YES        (null)                   select,insert,update,references
reference_id     int(16)           (null)             YES        (null)                   select,insert,update,references
viewer_id        varchar(64)       latin1_swedish_ci  YES        (null)                   select,insert,update,references

Here, body and template_params hold JSON data stored as strings. I want to create a schema for this table in HBase.
Operations performed on this data:
1. Retrieve activities by user ID.
2. Retrieve activities by viewer ID.
3. Retrieve activities for a particular type_id, or for a particular type_id and user_id.
4. Retrieve activities made after time t.
What would be an appropriate schema for this?
Regarding 4 (retrieving activities made after time t): this is not a problem. HBase stores a timestamp with every cell, and you can query for entries after time t.
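One caveat worth spelling out: by default the cell timestamp is the time of the write, in milliseconds since the epoch, so migrated rows only carry their original posting time if it is set explicitly. A minimal sketch of the conversion, assuming posted_time arrives as a MySQL DATETIME string in UTC (the function names are hypothetical):

```python
from datetime import datetime, timezone


def to_hbase_timestamp(posted_time: str) -> int:
    """Convert a 'YYYY-MM-DD HH:MM:SS' string to milliseconds since the epoch.

    Assumption: the stored DATETIME values are in UTC.
    """
    dt = datetime.strptime(posted_time, "%Y-%m-%d %H:%M:%S")
    return int(dt.replace(tzinfo=timezone.utc).timestamp() * 1000)


def is_after(cell_timestamp: int, t: str) -> bool:
    """Return True if a cell timestamp is strictly later than time t."""
    return cell_timestamp > to_hbase_timestamp(t)
```

The resulting value could then be passed as the timestamp argument of HappyBase's Table.put() when writing each activity.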
As for 1, 2 and 3, are you trying to make access fast? If so, I recommend creating 3 separate tables to store the data. Yes, there is redundancy, but the queries will be fast.
- Use user_id as the row key, and store the rest of the values in columns.
- Use viewer_id as the row key, and store the rest of the values in columns.
- Use type_id and user_id as the row key, with type_id in front of user_id. That way, you can query by type_id alone if only that is provided, and by type_id and user_id if both are provided. (Note that you have to use a scan here, not a regular get().)
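Two practical details affect how these keys behave under a prefix scan. Plain concatenation lets the prefix for type 1 also match type 12, and a key made only of user_id holds a single row per user. A minimal sketch of one way to build the keys, assuming a fixed-width, non-negative type_id, a separator that never occurs in the IDs, and the activity id appended to keep each activity in its own row (all three are assumptions, and the helper names are hypothetical):

```python
SEP = "|"  # hypothetical separator, assumed never to appear in an ID


def type_user_key(type_id: int, user_id: str = "") -> str:
    """Build a row key, or a scan prefix when user_id is omitted.

    type_id is zero-padded to 10 digits so that the prefix for
    type 1 cannot match type 12.
    """
    return f"{type_id:010d}{SEP}{user_id}"


def activity_key(owner_id: str, activity_id: int) -> str:
    """Build a per-activity row key for the user or viewer table.

    Appending the activity id gives every activity its own row;
    scanning with the prefix owner_id + SEP returns all of them.
    """
    return f"{owner_id}{SEP}{activity_id:010d}"
```

With keys like these, type_user_key(1) works as a scan prefix for every user of type 1, and "alice" + SEP works as a scan prefix for all of one user's activities.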
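You can code this using the HappyBase Python library as follows:

    import happybase

    # Assumes an HBase Thrift server on localhost and a column family
    # named 'cf' in each table (HappyBase column names are 'family:qualifier').
    # IDs are assumed to be strings (bytes under Python 3).
    conn = happybase.Connection()
    user = conn.table('user')
    viewer = conn.table('viewer')
    type_user = conn.table('type_user')

    def insert(user_id, viewer_id, type_id):
        # Write the same activity to all three tables, one per access pattern.
        user.put(user_id, {'cf:viewer_id': viewer_id, 'cf:type_id': type_id})
        viewer.put(viewer_id, {'cf:user_id': user_id, 'cf:type_id': type_id})
        type_user.put(type_id + user_id, {'cf:viewer_id': viewer_id})

    def get_user(user_id):
        return user.row(user_id)

    def get_viewer(viewer_id):
        return viewer.row(viewer_id)

    def get_type_user(type_id, user_id=""):
        # With an empty user_id, the prefix is just type_id.
        rowkey = type_id + user_id
        # Use a scan here (not row()) so that a type_id-only prefix also matches.
        return type_user.scan(row_prefix=rowkey)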