HBase Java API

目录

  • 1. HBase Java API
    • 1.1 Admin
    • 1.2 TableDescriptor
    • 1.3 HBaseConfiguration
    • 1.4 Table
    • 1.5 Put
    • 1.6 Get
    • 1.7 Delete
    • 1.8 Result
  • 2. HBase Java API示例
    • 2.1 pom.xml中依赖配置
    • 2.2 工具类Util
    • 2.3 创建表格Creation
    • 2.4 插入数据Insertion
    • 2.5 获取数据Selection
    • 2.6 删除数据Deletion
    • 2.7 HBase Java API代码
    • 2.8 结果
  • 参考

1. HBase Java API
1.1 Admin

  Admin是一个操作HBase非常重要的类,用于管理表,这个类属于org.apache.hadoop.hbase.client。使用这个类,可以执行管理员任务。使用Connection.getAdmin()方法来获取管理员的实例。

Admin中部分方法 功能
void createTable(TableDescriptor desc) 创建一个新的表
void disableTable(TableName tableName) 禁用表
void deleteTable(TableName tableName) 删除表

1.2 TableDescriptor

  TableDescriptor包含了一个HBase表的结构信息。org.apache.hadoop.hbase.client.TableDescriptorBuilder提供了更简便的方式来创建TableDescriptor对象。

TableDescriptor中部分方法 功能
TableDescriptorBuilder newBuilder(final TableName name) 构造一个指定TableName对象的表描述符
TableDescriptorBuilder setColumnFamily(final ColumnFamilyDescriptor family) 在表描述符中设置列族描述符

1.3 HBaseConfiguration

  添加HBase的配置到配置文件,这个类属于org.apache.hadoop.hbase.client

方法 功能
Configuration create() 创建使用HBase的资源配置

1.4 Table

  Table是HBase表中HBase的内部类,用于实现单个HBase表的通信,这个类属于org.apache.hadoop.hbase.client

Table中部分方法 功能
void close() 释放Table的所有资源
void delete(Delete delete) 删除指定的单元格/行
Result get(Get get) 检索指定的单元格/行
void put(Put put) 将数据插入表中

1.5 Put

  此类用于为单个行执行Put操作,这个类属于org.apache.hadoop.hbase.client

Put中部分方法 功能
Put(byte[] row) 创建一个指定行键的Put操作
Put addColumn(byte[] family, byte[] qualifier, byte[] value) 添加指定的列和值到Put

1.6 Get

  此类用于对单行执行Get操作,这个类属于org.apache.hadoop.hbase.client

Get中部分方法 功能
Get(byte[] row) 为指定行键创建一个Get操作
Get addFamily(byte[] family) 检索来自特定列族的所有列

1.7 Delete

  这个类用于对单行执行删除操作,这个类属于org.apache.hadoop.hbase.client

Delete中部分方法 功能
Delete(byte[] row) 创建一个指定行键的Delete操作
Delete addFamily(final byte[] family) 删除指定列族中所有列的所有版本
Delete addColumn(final byte[] family, final byte[] qualifier) 删除指定列的最新版本

1.8 Result

  这个类是用来获取Get或扫描查询的单行结果。

Result中部分方法 功能
byte[] getValue(byte[] family, byte[] qualifier) 获取指定列的最新版本

2. HBase Java API示例
2.1 pom.xml中依赖配置

  <dependencies>
    <dependency>
      <groupId>junit</groupId>
      <artifactId>junit</artifactId>
      <version>4.11</version>
      <scope>test</scope>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-common</artifactId>
      <version>2.5.10</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-client</artifactId>
      <version>2.5.10</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hbase</groupId>
      <artifactId>hbase-server</artifactId>
      <version>2.5.10</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-common</artifactId>
      <version>3.3.6</version>
    </dependency>
    <dependency>
      <groupId>org.apache.hadoop</groupId>
      <artifactId>hadoop-auth</artifactId>
      <version>3.3.6</version>
    </dependency>
  </dependencies>

2.2 工具类Util

  Util负责获取和关闭与HBase通信的连接。

import java.io.IOException;

import org.apache.hadoop.conf.Configuration;
import org.apache.hadoop.hbase.HBaseConfiguration;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.ConnectionFactory;

public class Util {
    public static Connection getConnection() throws IOException {
        Configuration conf = HBaseConfiguration.create();
        Connection conn = ConnectionFactory.createConnection(conf);
        return conn;
    }

    public static void closeConnection(Connection conn) throws IOException {
        conn.close();
    }
}

2.3 创建表格Creation

  Creation负责创建HBase中的表格,其中的void create(Connection conn, String tableName, String[] columnFamilies)支持在建表的同时建立1-3个列族。

import java.io.IOException;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptor;
import org.apache.hadoop.hbase.client.ColumnFamilyDescriptorBuilder;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.TableDescriptorBuilder;

public class Creation {
        public static void create(Connection conn, String tableName, String[] columnFamilies) throws IOException {
        if (columnFamilies.length == 0) {
            System.out.println("please provide at least one column family.");
            return;
        }
        if (columnFamilies.length > 3) {
            System.out.println("please reduce the number of column families.");
            return;
        }

        Admin admin = conn.getAdmin();
        TableName tableName2 = TableName.valueOf(tableName);

        if (admin.tableExists(tableName2)) {
            System.out.println("table exists!");
            return;
        }

        TableDescriptorBuilder tableDescBuilder = TableDescriptorBuilder.newBuilder(tableName2);
        for (String cf : columnFamilies) {
            ColumnFamilyDescriptor colFamilyDesc = ColumnFamilyDescriptorBuilder.of(cf);
            tableDescBuilder.setColumnFamily(colFamilyDesc);
        }
        admin.createTable(tableDescBuilder.build());
        System.out.println("create table success!");
        admin.close();
    }
}

2.4 插入数据Insertion

  Insertion负责插入数据,方法的功能介绍如下。

方法 功能
void put(Connection conn, String tableName, String row, String family, String column, String data) 对已有的HBase表中指定列插入数据
void put(Connection conn, String tableName, String row, String family, String[] columns, String[] datas) 对已有的HBase表中指定列族(该列族下有很多列)插入数据
import java.io.IOException;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Put;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class Insertion {
    public static void put(Connection conn, String tableName, String row, String family, String column, String data)
            throws IOException {
        Table table = conn.getTable(TableName.valueOf(tableName));
        Put p1 = new Put(Bytes.toBytes(row));
        p1.addColumn(Bytes.toBytes(family), Bytes.toBytes(column), Bytes.toBytes(data));
        table.put(p1);
        System.out.println(String.format("put '%s', '%s', '%s:%s', '%s'", tableName, row, family, column, data));
    }

    public static void put(Connection conn, String tableName, String row, String family, String[] columns,
            String[] datas)
            throws IOException {
        if (columns.length != datas.length) {
            System.out.println("the length of columns must equal the length of datas.");
            return;
        }
        for (int i = 0; i < columns.length; i++) {
           put(conn, tableName, row, family, columns[i], datas[i]);
        }
    }
}

2.5 获取数据Selection

  Selection负责获取数据,方法的功能介绍如下。

方法 功能
get(Connection conn, String tableName, String row) 打印指定表中指定行键的记录
get(Connection conn, String tableName, String row, String family) 打印指定表中指定列族的记录
get(Connection conn, String tableName, String row, String family, String qualifier) 打印指定表中指定列族下指定列的记录
scan(Connection conn, String tableName) 打印指定表的记录
import java.io.IOException;

import org.apache.hadoop.hbase.Cell;
import org.apache.hadoop.hbase.CellUtil;
import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Get;
import org.apache.hadoop.hbase.client.Result;
import org.apache.hadoop.hbase.client.ResultScanner;
import org.apache.hadoop.hbase.client.Scan;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class Selection {
    public static void get(Connection conn, String tableName, String row) throws IOException {
        Table table = conn.getTable(TableName.valueOf(tableName));
        Get g1 = new Get(Bytes.toBytes(row));
        Result res = table.get(g1);
        for (Cell cell : res.listCells()) {
            String family2 = Bytes.toString(CellUtil.cloneFamily(cell));
            String column2 = Bytes.toString(CellUtil.cloneQualifier(cell));
            String data = Bytes.toString(CellUtil.cloneValue(cell));
            System.out.println(String.format("get '%s', '%s': %s:%s: %s", tableName, row, family2, column2, data));
        }
        System.out.println();
    }

    public static void get(Connection conn, String tableName, String row, String family) throws IOException {
        Table table = conn.getTable(TableName.valueOf(tableName));
        Get g1 = new Get(Bytes.toBytes(row));
        g1.addFamily(Bytes.toBytes(family));
        Result res = table.get(g1);
        for (Cell cell : res.listCells()) {
            String column = Bytes.toString(CellUtil.cloneQualifier(cell));
            String data = Bytes.toString(CellUtil.cloneValue(cell));
            System.out.println(String.format("get '%s', '%s', '%s': %s: %s", tableName, row, family, column, data));
        }
        System.out.println();
    }

    public static void get(Connection conn, String tableName, String row, String family, String qualifier)
            throws IOException {
        Table table = conn.getTable(TableName.valueOf(tableName));
        Get g1 = new Get(Bytes.toBytes(row));
        Result res = table.get(g1);
        System.out.println(String.format("get '%s', '%s', '%s:%s': %s\n", tableName, row, family, qualifier,
                new String(res.getValue(family.getBytes(), qualifier.getBytes()))));
    }

    public static void scan(Connection conn, String tableName) throws IOException {
        Table table = conn.getTable(TableName.valueOf(tableName));
        Scan scan = new Scan();
        ResultScanner scanner = table.getScanner(scan);
        System.out.println("scan: ");
        for (Result res = scanner.next(); res != null; res = scanner.next()) {
            for (Cell cell : res.listCells()) {
                String row = Bytes.toString(CellUtil.cloneRow(cell));
                String columnFamily = Bytes.toString(CellUtil.cloneFamily(cell));
                String column = Bytes.toString(CellUtil.cloneQualifier(cell));
                String data = Bytes.toString(CellUtil.cloneValue(cell));
                System.out.println(String.format("row: %s, family: %s, column; %s, data: %s", row, columnFamily,
                        column, data));
            }
        }
        scanner.close();
    }
}

2.6 删除数据Deletion

  Deletion负责删除数据,方法的功能介绍如下。

方法 功能
void delete(Connection conn, String tableName) 删除指定表
void delete(Connection conn, String tableName, String row) 删除指定表中指定行键的记录
void delete(Connection conn, String tableName, String row, String family) 删除指定表中指定列族的记录
void delete(Connection conn, String tableName, String row, String family, String column) 删除指定表中指定列最新的记录
void delete(Connection conn, String tableName, String row, String family, String[] columns) 删除指定表中指定列族下多个列最新的记录
import java.io.IOException;

import org.apache.hadoop.hbase.TableName;
import org.apache.hadoop.hbase.client.Admin;
import org.apache.hadoop.hbase.client.Connection;
import org.apache.hadoop.hbase.client.Delete;
import org.apache.hadoop.hbase.client.Table;
import org.apache.hadoop.hbase.util.Bytes;

public class Deletion {
    public static void delete(Connection conn, String tableName) throws IOException {
        Admin admin = conn.getAdmin();
        TableName tableName2 = TableName.valueOf(tableName);
        if (admin.tableExists(tableName2)) {
            try {
                admin.disableTable(tableName2);
                admin.deleteTable(tableName2);
            } catch (IOException e) {
                e.printStackTrace();
                System.out.println("the deletion of " + tableName + " fails");
                admin.close();
            }
            System.out.println("the deletion of " + tableName + " succeeds");
        }
        admin.close();
    }

    public static void delete(Connection conn, String tableName, String row) throws IOException {
        Table table = conn.getTable(TableName.valueOf(tableName));
        Delete d1 = new Delete(Bytes.toBytes(row));
        table.delete(d1);
    }

    public static void delete(Connection conn, String tableName, String row, String family) throws IOException {
        Table table = conn.getTable(TableName.valueOf(tableName));
        Delete d1 = new Delete(Bytes.toBytes(row));
        d1.addFamily(Bytes.toBytes(family));
        table.delete(d1);
    }

    public static void delete(Connection conn, String tableName, String row, String family, String column)
            throws IOException {
        Table table = conn.getTable(TableName.valueOf(tableName));
        Delete d1 = new Delete(Bytes.toBytes(row));
        d1.addColumn(Bytes.toBytes(family), Bytes.toBytes(column));
        table.delete(d1);
    }

    public static void delete(Connection conn, String tableName, String row, String family, String[] columns) throws IOException {
        for (String col : columns) {
            delete(conn, tableName, row, family, col);
        }
    }
}

2.7 HBase Java API代码

  这段代码通过HBase Java API创建如下的表格:

name grade course
chinese math english
xiapi 1 97 128 85
xiaoxue 2 90 120 90
liming 3

import org.apache.hadoop.hbase.client.Connection;

public class App {
    public static void main(String[] args) throws Exception {
        Connection conn = Util.getConnection();
        String[] col1 = { "chinese", "math", "english" };
        Deletion.delete(conn, "scores");
        Creation.create(conn, "scores", new String[] { "grade", "course" });
        Insertion.put(conn, "scores", "liming", "grade", "", "3");
        Insertion.put(conn, "scores", "xiapi", "grade", "", "1");
        Insertion.put(conn, "scores", "xiapi", "course", col1, new String[] { "97", "128", "85" });
        Insertion.put(conn, "scores", "xiaoxue", "grade", "", "2");
        Insertion.put(conn, "scores", "xiaoxue", "course", col1, new String[] { "90", "120", "90" });
        Selection.get(conn, "scores", "xiaoxue", "grade");
        Selection.get(conn, "scores", "xiaoxue", "course");
        Selection.get(conn, "scores", "xiapi", "course", "math");
        Selection.get(conn, "scores", "xiapi");
        Selection.scan(conn, "scores");
        Deletion.delete(conn, "scores", "xiapi", "grade");
        Deletion.delete(conn, "scores", "xiapi", "course", "math");
        Deletion.delete(conn, "scores", "xiaoxue", "course");
        Deletion.delete(conn, "scores", "liming");
        Selection.scan(conn, "scores");
        Deletion.delete(conn, "scores");
        Util.closeConnection(conn);
    }
}

2.8 结果

参考

https://blog.csdn.net/weixin_44131414/article/details/116749786解决了这个问题。

吴章勇 杨强著 大数据Hadoop3.X分布式处理实战

版权声明

   站内部分内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供网络资源分享服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌抄袭侵权/违法违规的内容, 请 联系我们 一经核实,立即删除。并对发布账号进行永久封禁处理。在为用户提供最好的产品同时,保证优秀的服务质量。


本站仅提供信息存储空间,不拥有所有权,不承担相关法律责任。

给TA打赏
共{{data.count}}人
人已打赏
大数据

精益数据分析在线阅读(《精益数据分析》)

2025-2-25 15:20:03

大数据

es查询一万条以上数据(es一次查询出百万条数据)

2025-2-25 15:20:05

0 条回复 A文章作者 M管理员
    暂无讨论,说说你的看法吧
个人中心
购物车
优惠劵
今日签到
有新私信 私信列表
搜索