0%

Base64编码与解码

Base64编码是一种用64个字符来表示任意二进制数据的方法。

简介

当需要在原本适用于存储或传输文本或字符流的环境中存储或传输二进制数据时,可以考虑使用Base64编码方法。
Base64编码方法广泛应用于

可以将图片文件或多语言的字符串使用Base64编码,便于存储及传输

使用Base64编码时,会把每3个8位共24字节(3*8=24)转化为4个6位的24字节(4*6=24),之后在每份的6位前面补两个值为0的位,形成新的8位(1字节)的形式。即原先的3个字节转变成为4个字节,转化后的字节值的范围为00000000至00111111,即0至2^6,使用大小为64的编码表可以完成所有值的表示。在转化的过程中,如果遇到字符不足3个字节,则用0在尾部填充补够3字节,对应输出字符使用’=’,因此编码后输出的文本末尾可能会出现1或2个’=’。Base64解码的过程则与上述过程相反。
常用的一种Base64编码表如下表所示:

Value Char Value Char Value Char Value Char
0 A 16 Q 32 g 48 w
1 B 17 R 33 h 49 x
2 C 18 S 34 i 50 y
3 D 19 T 35 j 51 z
4 E 20 U 36 k 52 0
5 F 21 V 37 l 53 1
6 G 22 W 38 m 54 2
7 H 23 X 39 n 55 3
8 I 24 Y 40 o 56 4
9 J 25 Z 41 p 57 5
10 K 26 a 42 q 58 6
11 L 27 b 43 r 59 7
12 M 28 c 44 s 60 8
13 N 29 d 45 t 61 9
14 O 30 e 46 u 62 +
15 P 31 f 47 v 63 /

应用案例

将字符串"Man"进行Base64编码:

"Man"中三个字符按转化为二进制依次为:

1
2
3
'M' -> 01001101
'a' -> 01100001
'n' -> 01101110

拼接在一起即"Man"的二进制格式为01001101-01100001-01101110,将其分为每组6位的4组得到 010011-010110-000101-101110。每组前部补两个0,补全为1个字节,共四个字节,此时每个字节对应的十进制值在0-2^6之间,即0-64。根据编码表得到对应的经过Base64编码后的字符串为"TWFu"。上述过程可以参看下表:

source ASCII (if <128) M a n
source octets 77 (0x4d) 97 (0x61) 110 (0x6e)
Bit pattern 0 1 0 0 1 1 0 1 0 1 1 0 0 0 0 1 0 1 1 0 1 1 1 0
Index 19 22 5 46
Base64-encoded T W F u
encoded octets 84 (0x54) 87 (0x57) 70 (0x46) 117 (0x75)

C++实现Base64编码与解码

本节代码来源 http://www.adp-gmbh.ch/cpp/common/base64.html

涉及到两个文件:头文件 base64.h,源文件 base64.cpp,除此之外还有使用Base64相关函数的测试用例代码。

base64.h

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
/*
base64.cpp and base64.h

Copyright (C) 2004-2008 René Nyffenegger

This source code is provided 'as-is', without any express or implied
warranty. In no event will the author be held liable for any damages
arising from the use of this software.

Permission is granted to anyone to use this software for any purpose,
including commercial applications, and to alter it and redistribute it
freely, subject to the following restrictions:

1. The origin of this source code must not be misrepresented; you must not
claim that you wrote the original source code. If you use this source code
in a product, an acknowledgment in the product documentation would be
appreciated but is not required.

2. Altered source versions must be plainly marked as such, and must not be
misrepresented as being the original source code.

3. This notice may not be removed or altered from any source distribution.

René Nyffenegger rene.nyffenegger@adp-gmbh.ch

*/
#include <string>

std::string base64_encode(unsigned char const* , unsigned int len);
std::string base64_decode(std::string const& s);

base64.cpp

1
2
3
4
5
6
7
8
9
10
11
12
13
14
15
16
17
18
19
20
21
22
23
24
25
26
27
28
29
30
31
32
33
34
35
36
37
38
39
40
41
42
43
44
45
46
47
48
49
50
51
52
53
54
55
56
57
58
59
60
61
62
63
64
65
66
67
68
69
70
71
72
73
74
75
76
77
78
79
80
81
82
83
84
85
86
87
88
89
90
91
92
93
94
95
96
#include "base64.h"
#include <iostream>

static const std::string base64_chars =
"ABCDEFGHIJKLMNOPQRSTUVWXYZ"
"abcdefghijklmnopqrstuvwxyz"
"0123456789+/";


static inline bool is_base64(unsigned char c) {
return (isalnum(c) || (c == '+') || (c == '/'));
}

std::string base64_encode(unsigned char const* bytes_to_encode, unsigned int in_len) {
std::string ret;
int i = 0;
int j = 0;
unsigned char char_array_3[3];
unsigned char char_array_4[4];

while (in_len--) {
char_array_3[i++] = *(bytes_to_encode++);
if (i == 3) {
char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);
char_array_4[3] = char_array_3[2] & 0x3f;

for(i = 0; (i <4) ; i++)
ret += base64_chars[char_array_4[i]];
i = 0;
}
}

if (i)
{
for(j = i; j < 3; j++)
char_array_3[j] = '\0';

char_array_4[0] = (char_array_3[0] & 0xfc) >> 2;
char_array_4[1] = ((char_array_3[0] & 0x03) << 4) + ((char_array_3[1] & 0xf0) >> 4);
char_array_4[2] = ((char_array_3[1] & 0x0f) << 2) + ((char_array_3[2] & 0xc0) >> 6);
char_array_4[3] = char_array_3[2] & 0x3f;

for (j = 0; (j < i + 1); j++)
ret += base64_chars[char_array_4[j]];

while((i++ < 3))
ret += '=';

}

return ret;

}

std::string base64_decode(std::string const& encoded_string) {
int in_len = encoded_string.size();
int i = 0;
int j = 0;
int in_ = 0;
unsigned char char_array_4[4], char_array_3[3];
std::string ret;

while (in_len-- && ( encoded_string[in_] != '=') && is_base64(encoded_string[in_])) {
char_array_4[i++] = encoded_string[in_]; in_++;
if (i ==4) {
for (i = 0; i <4; i++)
char_array_4[i] = base64_chars.find(char_array_4[i]);

char_array_3[0] = (char_array_4[0] << 2) + ((char_array_4[1] & 0x30) >> 4);
char_array_3[1] = ((char_array_4[1] & 0xf) << 4) + ((char_array_4[2] & 0x3c) >> 2);
char_array_3[2] = ((char_array_4[2] & 0x3) << 6) + char_array_4[3];

for (i = 0; (i < 3); i++)
ret += char_array_3[i];
i = 0;
}
}

if (i) {
for (j = i; j <4; j++)
char_array_4[j] = 0;

for (j = 0; j <4; j++)
char_array_4[j] = base64_chars.find(char_array_4[j]);

char_array_3[0] = (char_array_4[0] << 2) + ((char_array_4[1] & 0x30) >> 4);
char_array_3[1] = ((char_array_4[1] & 0xf) << 4) + ((char_array_4[2] & 0x3c) >> 2);
char_array_3[2] = ((char_array_4[2] & 0x3) << 6) + char_array_4[3];

for (j = 0; (j < i - 1); j++) ret += char_array_3[j];
}

return ret;
}

在代码中使用上述函数:

1
2
3
4
5
6
7
8
9
10
11
12
13
14
#include "base64.h"
#include <iostream>

int main() {
const std::string s = "ADP GmbH\nAnalyse Design & Programmierung\nGesellschaft mit beschränkter Haftung" ;

std::string encoded = base64_encode(reinterpret_cast<const unsigned char*>(s.c_str()), s.length());
std::string decoded = base64_decode(encoded);

std::cout << "encoded: " << encoded << std::endl;
std::cout << "decoded: " << decoded << std::endl;

return 0;
}

REFERENCE

https://en.wikipedia.org/wiki/Base64
http://www.adp-gmbh.ch/cpp/common/base64.html