Reading content from web URL
I know how to read URLs and save the content for further analysis.
The issue I am facing is that I want to read specific content from a URL in a specific way.
For example, from this URL, https://www.gem.wiki/Almaty-2_power_station, I would like to read Table 2 in table format, or read only the tables that contain specific words.
From searching online, I found that I can read tables directly from URLs, but I am not sure whether the table I want is an actual table or just text content.
Any help would be appreciated.
Accepted Answer
Rahul
on 30 Aug 2024
I understand that you are trying to read the content of 'Table 2' from the URL https://www.gem.wiki/Almaty-2_power_station.
You can achieve the desired result with the following code:
url = 'https://www.gem.wiki/Almaty-2_power_station';
htmlContent = webread(url); % Read the page HTML from the URL
tree = htmlTree(htmlContent); % Parse the HTML (htmlTree is part of Text Analytics Toolbox)
tables = findElement(tree, "table"); % Find all <table> elements in the DOM tree
% Here I have used index 4 because some other elements of the HTML page
% are also treated as tables.
secondTableElement = tables(4);
% Find all rows in the second table
rows = findElement(secondTableElement, "tr");
% Extract the header text from the first row as character vectors
headerCells = findElement(rows(1), "th");
columnNames = cell(1, numel(headerCells));
for j = 1:numel(headerCells)
    columnNames{j} = char(strtrim(extractHTMLText(headerCells(j))));
end
% Extract the data rows into a cell array, one row per <tr>
tableData = {};
for i = 2:numel(rows)
    cells = findElement(rows(i), "td");
    % Extract text from each cell
    rowData = cell(1, numel(cells));
    for j = 1:numel(cells)
        rowData{j} = strtrim(extractHTMLText(cells(j)));
    end
    % Appending assumes every row has the same number of cells
    tableData = [tableData; rowData];
end
% Build the table using the 'cell2table' function
secondTable = cell2table(tableData, 'VariableNames', columnNames);
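For the other part of the question (picking tables that contain specific words, and checking whether the content is a real table at all), a minimal sketch is shown here. It uses the same htmlTree, findElement and extractHTMLText functions as the code above. The HTML string and the keyword are made-up placeholders so the snippet runs without a network connection; they are not taken from the actual page. If findElement returns no elements for "table", the content you are after is probably plain text rather than a real HTML table.

```matlab
% Placeholder HTML standing in for a downloaded page (not real data).
html = "<html><body>" + ...
    "<table><tr><th>Menu</th></tr><tr><td>Home</td></tr></table>" + ...
    "<table><tr><th>Unit name</th><th>Capacity (MW)</th></tr>" + ...
    "<tr><td>Unit A</td><td>10</td></tr></table>" + ...
    "</body></html>";

tree = htmlTree(html);
tables = findElement(tree, "table");   % all <table> elements in the page

keyword = "Capacity";                  % hypothetical search word
isMatch = false(1, numel(tables));
for k = 1:numel(tables)
    tableText = extractHTMLText(tables(k));   % visible text of table k
    isMatch(k) = any(contains(tableText, keyword, 'IgnoreCase', true));
end

matchingIdx = find(isMatch);           % indices of tables containing the keyword
fprintf('%d <table> elements found; matching indices: %s\n', ...
    numel(tables), mat2str(matchingIdx));
```

With a real page, you would pass the output of webread to htmlTree instead of the placeholder string, and then use tables(matchingIdx(1)) in place of the hard-coded index.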
You can refer to the following documentation:
'cell2table': https://www.mathworks.com/help/releases/R2024a/matlab/ref/cell2table.html?searchHighlight=cell2table&s_tid=doc_srchtitle
Hope this helps! Thanks.