Beta release
siaikin committed May 27, 2019
1 parent 6216cde commit c51ce99
Showing 40 changed files with 13,097 additions and 2 deletions.
1 change: 1 addition & 0 deletions .idea/.name

Some generated files are not rendered by default. Learn more about how customized files appear on GitHub.

6 changes: 6 additions & 0 deletions .idea/encodings.xml


12 changes: 12 additions & 0 deletions .idea/mdReverse.iml


6 changes: 6 additions & 0 deletions .idea/misc.xml


8 changes: 8 additions & 0 deletions .idea/modules.xml


6 changes: 6 additions & 0 deletions .idea/vcs.xml


585 changes: 585 additions & 0 deletions .idea/workspace.xml

Large diffs are not rendered by default.

33 changes: 31 additions & 2 deletions README.md
Original file line number Diff line number Diff line change
@@ -1,2 +1,31 @@
# nwodkram
written by JavaScript. Convert HTML to Markdown.
# mdReserve
Converts HTML text into Markdown-formatted text. Written in JavaScript.

[Demo](https://abc1310054026.github.io/md-reserve/)

* [Development progress](./development.md)
* [English Version](./README_EN.md)
## Installation
npm:
```
npm install md-reserve
```
## Usage
###### ES6 Modules
```javascript
import {MdReverse} from 'md-reverse';

const mdReserve = new MdReverse();
mdReserve.toMarkdown(`<h1>Hello World!</h1>`);
```
###### ES5
```javascript
var MR = require('md-reverse');
var mdReserve = new MR.MdReverse();
mdReserve.toMarkdown(`<h1>Hello World!</h1>`);
```
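## Notes
* Page structures vary wildly from site to site, so not every website can be handled. Ideally, the HTML to convert was itself produced from Markdown or conforms to the Markdown specification.
Converting HTML that does not follow the Markdown specification may be much less accurate.
* Currently only Markdown's [basic syntax](https://www.markdownguide.org/basic-syntax) is supported. The [extended syntax](https://www.markdownguide.org/extended-syntax) will be completed gradually later.
* To improve conversion accuracy, `md-reserve` removes HTML tags it cannot recognize. An attempt to address this will be made once the extended syntax is complete.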
26 changes: 26 additions & 0 deletions README_EN.md
@@ -0,0 +1,26 @@
# mdReserve
Convert HTML to Markdown. Written in JavaScript.

[Demo](https://abc1310054026.github.io/md-reserve/)

* [Development progress](./development.md)
* [For more information, see the Chinese version](./README.md)
## Installation
npm:
```
npm install md-reserve
```
## Usage
###### ES6 Modules
```javascript
import {MdReverse} from 'md-reverse';

const mdReserve = new MdReverse();
mdReserve.toMarkdown(`<h1>Hello World!</h1>`);
```
###### ES5
```javascript
var MR = require('md-reverse');
var mdReserve = new MR.MdReverse();
mdReserve.toMarkdown(`<h1>Hello World!</h1>`);
```
17 changes: 17 additions & 0 deletions babel.config.js
@@ -0,0 +1,17 @@
const presets = [
[
"@babel/preset-env",
{
// targets: {
// edge: "17",
// firefox: "60",
// chrome: "67",
// safari: "11.1",
// },
useBuiltIns: "usage",
corejs: 3
},
],
];

module.exports = { presets };
15 changes: 15 additions & 0 deletions demo.js
@@ -0,0 +1,15 @@
// import {MdReserve} from 'md-reserve';
var MR = require('md-reverse');
// css file
import './styles.css';

const mdReserve = new MR.MdReverse();

const htmlArea = document.getElementById('html-area');
const mdArea = document.getElementById('markdown-area');

// Re-run the conversion on every edit and show the resulting Markdown.
htmlArea.addEventListener('input', () => {
  mdArea.value = mdReserve.toMarkdown(htmlArea.value);
});
76 changes: 76 additions & 0 deletions development.md
@@ -0,0 +1,76 @@
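# Implementation
## Lexical analysis
1. Receive the raw HTML text.
2. Split it into the parts of HTML elements: `opening tag`, `enclosed text content`, `closing tag`, [following the syntax of HTML elements](https://developer.mozilla.org/zh-CN/docs/Glossary/HTML).
The result is an array that stores the parts in the order they were split.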
```html
<div>
<p>Hello</p>
<p><a href="http://github.com">github</a></p>
</div>
```
**result**: `['<div>', '<p>', 'Hello', '</p>', '<p>', '<a href="http://github.com">', 'github', '</a>', '</p>', '</div>']`
3. Output the result.
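
The splitting step above can be sketched as follows. This is a minimal illustration and not the library's actual lexer: the function name `splitHtml` is hypothetical, and it assumes attribute values use double quotes. Quote state is tracked so that a `>` inside an attribute value does not end the tag early.

```javascript
// Sketch only: `splitHtml` is a hypothetical name, not part of the published API.
function splitHtml(html) {
  const parts = [];
  let inTag = false;   // true while between '<' and the matching '>'
  let inQuote = false; // true while inside a double-quoted attribute value
  let start = 0;       // start index of the current tag or text run

  for (let i = 0; i < html.length; i++) {
    const ch = html[i];
    if (inTag) {
      if (ch === '"') {
        inQuote = !inQuote;
      } else if (ch === '>' && !inQuote) {
        parts.push(html.slice(start, i + 1)); // complete tag, including '<' and '>'
        inTag = false;
        start = i + 1;
      }
    } else if (ch === '<') {
      const text = html.slice(start, i);
      if (text.trim() !== '') parts.push(text); // skip whitespace-only runs
      inTag = true;
      start = i;
    }
  }

  // Keep any text that follows the last tag.
  const tail = html.slice(start);
  if (!inTag && tail.trim() !== '') parts.push(tail);
  return parts;
}

const sample = '<div>\n<p>Hello</p>\n<p><a href="http://github.com">github</a></p>\n</div>';
console.log(splitHtml(sample));
```

Whitespace-only text between tags is dropped, which is why the line breaks in the sample do not appear in the output array.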
## Syntax analysis
1. Convert the lexical-analysis output `result` into an array of `Token`s.
```typescript
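// Token definition
let Token: {
    tag: string, // HTML tag name of the token
    type: number, // numeric type of the token
    attribute: { // map of the element's attribute names to their values (null when there are none)
        [key: string]: string
    } | null,
    position: number // whether the Token is an opening or a closing tag (opening tag: 1, closing tag: 2, empty element: 3)
}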
```
**Token (example conversion result)**:
```js
let Token = [
    {
        tag: 'div',
        type: 1,
        attribute: null,
        position: 1
    },
    // ...
]
```
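
A sketch of this conversion, under stated assumptions: the `TYPES` table, the `VOID_TAGS` set, the `#text` tag name and the extra `content` field for text runs are all hypothetical, since the document does not show how type numbers are assigned or how text is represented as a `Token`.

```javascript
// Hypothetical type ids; the real mapping is not shown in this document.
const TYPES = { '#text': 0, div: 1, p: 2, a: 3, img: 4, br: 5 };
// Elements that never have a closing tag (partial list, for illustration).
const VOID_TAGS = new Set(['img', 'br', 'hr', 'input']);
const OPEN = 1, CLOSE = 2, EMPTY = 3;

// Collect double-quoted attributes into an object, or null if there are none.
function parseAttributes(source) {
  const attribute = {};
  const pattern = /([^\s=]+)\s*=\s*"([^"]*)"/g;
  let match;
  while ((match = pattern.exec(source)) !== null) attribute[match[1]] = match[2];
  return Object.keys(attribute).length > 0 ? attribute : null;
}

function toTokens(parts) {
  return parts.map((part) => {
    if (part[0] !== '<') {
      // Text run. The `content` field is an assumption of this sketch.
      return { tag: '#text', type: TYPES['#text'], attribute: null, position: EMPTY, content: part };
    }
    const closing = part.startsWith('</');
    // Strip '<' or '</', the final '>', and an optional self-closing '/'.
    const inner = part.slice(closing ? 2 : 1, -1).replace(/\/$/, '').trim();
    const spaceAt = inner.search(/\s/);
    const tag = (spaceAt === -1 ? inner : inner.slice(0, spaceAt)).toLowerCase();
    return {
      tag,
      type: TYPES[tag] !== undefined ? TYPES[tag] : -1, // -1 marks an unknown element
      attribute: closing || spaceAt === -1 ? null : parseAttributes(inner.slice(spaceAt)),
      position: closing ? CLOSE : VOID_TAGS.has(tag) ? EMPTY : OPEN,
    };
  });
}

console.log(toTokens(['<div>', '<a href="http://github.com">', 'github', '</a>', '</div>']));
```

With this table, the first token for `'<div>'` has the same shape as the example above: `tag: 'div'`, `type: 1`, `attribute: null`, `position: 1`.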
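## Intermediate code generation (Virtual DOM Tree)
1. For each HTML element type, define its **identifier (`type`, a number), the child element types it allows, and the attributes it is allowed to keep**.
2. Take the result of the syntax analysis (the `Token` array) and convert it into a tree structure similar to a `DOM Tree`.
The tree node type is as follows: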
```typescript
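// Node definition
let Node: {
    type: number, // token type
    attribute: { // map of the element's attribute names to their values
        [key: string]: string
    },
    content: string | any[] | null, // node content: a string for text nodes, an array of child nodes for non-empty elements, null for empty elements
    isHTML: boolean, // whether the node should be emitted as raw HTML
    isCode: boolean // whether the node should be emitted as computer code
}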
```
3. Traverse the tree to make sure its structure is correct.
4. Output the result.
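
The tree-building step can be sketched with a stack of open elements. This is an illustration under assumptions, not the project's implementation: the type ids in the sample, the `content` field on text tokens and the name `buildTree` are hypothetical, and the validation in step 3 is reduced to ignoring closing tags that do not match the innermost open element.

```javascript
const OPEN = 1, CLOSE = 2, EMPTY = 3;

// Sketch only: `buildTree` is a hypothetical name.
function buildTree(tokens) {
  // Synthetic root so that top-level nodes have a parent to attach to.
  const root = { type: -1, attribute: null, content: [], isHTML: false, isCode: false };
  const stack = [root]; // open elements, innermost last

  for (const token of tokens) {
    const parent = stack[stack.length - 1];
    if (token.position === CLOSE) {
      // Close the innermost element only if the types match; stray closing tags are ignored.
      if (stack.length > 1 && parent.type === token.type) stack.pop();
      continue;
    }
    const isText = token.content !== undefined; // assumption: text tokens carry `content`
    const node = {
      type: token.type,
      attribute: token.attribute,
      content: isText ? token.content : token.position === EMPTY ? null : [],
      isHTML: false,
      isCode: false,
    };
    parent.content.push(node);
    if (token.position === OPEN) stack.push(node); // later tokens become its children
  }
  return root.content;
}

// Tokens for <div><p>Hello</p></div>, using hypothetical type ids (0: text, 1: div, 2: p).
const sampleTokens = [
  { tag: 'div', type: 1, attribute: null, position: OPEN },
  { tag: 'p', type: 2, attribute: null, position: OPEN },
  { tag: '#text', type: 0, attribute: null, position: EMPTY, content: 'Hello' },
  { tag: 'p', type: 2, attribute: null, position: CLOSE },
  { tag: 'div', type: 1, attribute: null, position: CLOSE },
];
console.log(JSON.stringify(buildTree(sampleTokens), null, 2));
```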
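## Target code generation (Markdown Text)
1. For each HTML element type, define its **opening-tag conversion rule and closing-tag conversion rule**.
2. Receive the intermediate code.
3. Generate the target code according to the opening-tag and closing-tag conversion rules.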
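
The rule-driven generation can be sketched as a recursive walk that emits the opening rule, then the children, then the closing rule. The `RULES` table, its type ids and the name `generateMarkdown` are hypothetical. In this sketch, elements without a rule lose their tags but keep their text, which is only one possible reading of how unrecognized tags are handled.

```javascript
// Hypothetical type ids and rules; the project's real table is not shown in this document.
const RULES = {
  1: { open: () => '# ', close: () => '\n\n' }, // h1
  2: { open: () => '', close: () => '\n\n' },   // p
  3: {                                          // a
    open: () => '[',
    close: (node) => `](${(node.attribute && node.attribute.href) || ''})`,
  },
  4: { open: () => '**', close: () => '**' },   // strong
};

// Sketch only: `generateMarkdown` is a hypothetical name.
function generateMarkdown(nodes) {
  let out = '';
  for (const node of nodes) {
    if (typeof node.content === 'string') {
      out += node.content; // text node
      continue;
    }
    const children = Array.isArray(node.content) ? generateMarkdown(node.content) : '';
    const rule = RULES[node.type];
    // Without a rule the tag is dropped and only the children's output is kept.
    out += rule ? rule.open(node) + children + rule.close(node) : children;
  }
  return out;
}

// Tree for <h1>Hello World!</h1><p><a href="http://github.com">github</a></p>
const text = (value) => ({ type: 0, attribute: null, content: value, isHTML: false, isCode: false });
const element = (type, attribute, content) => ({ type, attribute, content, isHTML: false, isCode: false });
const sampleTree = [
  element(1, null, [text('Hello World!')]),
  element(2, null, [element(3, { href: 'http://github.com' }, [text('github')])]),
];
console.log(generateMarkdown(sampleTree));
```

Keeping the rules in a table indexed by `type` means that supporting a new element only requires adding one entry, without touching the traversal.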
# To-do list

- [x] Lexical analysis
- [x] Syntax analysis
- [x] Intermediate code generation
- [x] Target code generation
- [ ] Finish the plugin module
- [ ] Markdown extended syntax
  - [ ] Tables
  - [ ] Code Syntax Highlighting
  - [ ] Footnotes
  - [ ] Heading IDs
  - [ ] Definition Lists
  - [ ] Strikethrough
  - [ ] Task Lists
  - [ ] Automatic URL Linking
100 changes: 100 additions & 0 deletions dist/demo.js

Large diffs are not rendered by default.

10 changes: 10 additions & 0 deletions dist/index.html
@@ -0,0 +1,10 @@
<!DOCTYPE html>
<html lang="en">
<head>
<meta charset="UTF-8">
<title>Title</title>
</head>
<body>

<script type="text/javascript" src="demo.js"></script></body>
</html>
1 change: 1 addition & 0 deletions dist/md-reserve.js

Large diffs are not rendered by default.

25 changes: 25 additions & 0 deletions es5/index.js
@@ -0,0 +1,25 @@
"use strict";

require("core-js/modules/es.array.for-each");

require("core-js/modules/es.object.define-property");

require("core-js/modules/es.object.keys");

require("core-js/modules/web.dom-collections.for-each");

Object.defineProperty(exports, "__esModule", {
value: true
});

var _mdReverse = require("./lib/mdReverse");

Object.keys(_mdReverse).forEach(function (key) {
if (key === "default" || key === "__esModule") return;
Object.defineProperty(exports, key, {
enumerable: true,
get: function get() {
return _mdReverse[key];
}
});
});
70 changes: 70 additions & 0 deletions es5/lib/lexer.js
@@ -0,0 +1,70 @@
"use strict";

require("core-js/modules/es.array.slice");

require("core-js/modules/es.object.define-properties");

require("core-js/modules/es.object.define-property");

require("core-js/modules/es.string.trim");

Object.defineProperty(exports, "__esModule", {
value: true
});
exports.Lexer = Lexer;

var _tools = require("./tools");

function Lexer() {}

Object.defineProperties(Lexer.prototype, {
analysis: {
value: analysis
}
});

function analysis(str) {
str = _tools.Tools.trim(str);
console.time('lexical analysis');
var result = [];

var stack = [],
_char,
start = 0,
end = 0;

for (var i = 0, len = str.length; i < len; i++) {
_char = str[i];

switch (_char) {
case '<':
if (stack[stack.length - 1] !== '"') {
stack.push(_char);
start = i;
if (start - end > 1 && !_tools.Tools.isEmpty(str, end + 1, start - 1)) result.push(str.slice(end + 1, start));
}

break;

case '>':
if (stack[stack.length - 1] === '<') {
stack.pop();
end = i;
result.push(str.slice(start, end + 1));
}

break;

case '"':
if (stack[stack.length - 1] === '"') {
stack.pop();
} else if (stack[stack.length - 1] === '<') {
stack.push(_char);
}

}
}

console.timeEnd('lexical analysis');
return result;
}