Code formatting in the forum
Although this forum is now in its third year and thousands of examples can be found, asking beginners to format their code is still a tedious task. The experienced contributors have explained the procedure thousands of times, and fewer than a handful of the beginners have found the time to thank them for it.
The problem has already been mentioned exhaustively in the wish-list. It shouldn't be complicated to solve by showing explicit instructions the first 5 times a user posts a question. Obviously neither the "{} code" nor the "? Help" button encourages people to learn the basics of the forum. But I'd hope that they would take the time to read text instructions like:
Formatted code is a core feature of this forum. Insert a blank line before and after the code and start each line with at least 2 spaces.
Follow the "? help" button to learn more.
And when this message disappears after the 5th posting, it could even get a red background and some flashing effects.
This would be much more efficient than leaving this thankless job to the editors and other diligent users.
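The formatting rule quoted above (a blank line before and after the code, each line starting with at least 2 spaces) can be applied mechanically. Here is a minimal sketch, assuming the snippet is held in a character vector with line-feed separators; `indentLines` and `formatForForum` are hypothetical helper names, not forum or MATLAB features.

```matlab
nl = char(10);   % line feed

% Hypothetical helper: prefix every line with two spaces.
indentLines = @(code) ['  ' strrep(code, nl, [nl '  '])];

% Hypothetical helper: two line feeds on each side, so that a blank line
% separates the code from the surrounding text even if that text does not
% end with a line feed.
formatForForum = @(code) [nl nl indentLines(code) nl nl];

snippet   = ['for k = 1:3' nl '    disp(k)' nl 'end'];
formatted = formatForForum(snippet);
disp(formatted)
```

The existing indentation of the snippet is preserved, because the two spaces are added in front of whatever each line already starts with.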
5 Comments
Evan
on 6 Aug 2013
I agree that more voting would make better use of the "reputation" system. It seems that the communities on other help forums vote far more freely, while here 0 or 1 is the most common score even for excellent answers.
Oftentimes, I notice that a user has submitted a very detailed and on-point response to someone's question and think to myself that it's a shame the answer was never accepted. It's only lately that I'm realizing that, even though I'm not the OP, I actually have the ability to at least give the author some sort of feedback/credit for their effort.
Accepted Answer
Cedric
on 6 Aug 2013
Edited: Cedric
on 8 Aug 2013
EDIT @ 4:30pm EST: strfind -> regexp with negative lookbehind to avoid matching nbsp;.
Here is a simple crawler. It is not my original idea, which was a mechanism at Mathworks level and not at a user (one of us) level. I implemented a few criteria which are not those listed above, as the crawler has to work with content that was already parsed and "preformatted" by the forum.
The criteria implemented should be improved. Typically, the function call(s)/def(s) detection is too "simple" and generates false positives when users write function names followed by parentheses in normal text.
Anyhow, this is just a simple demo.
The whole code below (both functions) should be saved in forumCrawler.m, and you can set pageDepth to control how many forum pages you want to process.
----------------------------------------------------------------------------------------------------------------
  function forumCrawler
      % Crawl the first pageDepth pages of the forum and report questions,
      % answers, and comments that look like they contain unformatted code.
      pageDepth = 1 ;
      baseURL   = 'http://www.mathworks.com' ;
      for pageId = 1 : pageDepth
          fprintf('\n=== Processing page %d..\n', pageId) ;
          url = sprintf('%s/matlabcentral/answers/?page=%d', baseURL, pageId) ;
          % Thread links: assumes each thread title is marked up as
          % <h3><a href="..."> in the page source.
          thread  = regexp(urlread(url), '(?<=<h3><a href=").*?(?=")', 'match') ;
          nThread = length(thread) ;
          for tId = 1 : nThread
              fprintf(' - Analyzing thread %d/%d..\n', tId, nThread) ;
              url = sprintf('%s%s', baseURL, thread{tId}) ;
              htmlBuffer = urlread(url) ;
              % - Scan question (skipped if no question body is found).
              question = regexp(htmlBuffer, ...
                  '(?<=class="question-body ).*?(?=</div>)', 'match') ;
              if ~isempty(question)
                  [tf, msg] = isLikelyUnformatted(question{1}) ;
                  if tf
                      fprintf('   [<a href="%s">question</a>] %s.\n', url, msg) ;
                  end
              end
              % - Scan answers; tokens are {id, body}.
              answer = regexp(htmlBuffer, ...
                  '<div id="([^"]+)" class="answer-body">(.*?)</div>', 'tokens') ;
              for cId = 1 : length(answer)
                  [tf, msg] = isLikelyUnformatted(answer{cId}{2}) ;
                  if tf
                      answerUrl = sprintf('%s#%s', url, answer{cId}{1}) ;
                      fprintf('   [<a href="%s">answer</a>  ] %s.\n', ...
                          answerUrl, msg) ;
                  end
              end
              % - Scan comments; tokens are {id, body}.
              comment = regexp(htmlBuffer, ...
                  '<div id="([^"]+)" class="comment-body">(.*?)</div>', 'tokens') ;
              for cId = 1 : length(comment)
                  [tf, msg] = isLikelyUnformatted(comment{cId}{2}) ;
                  if tf
                      commentUrl = sprintf('%s#%s', url, comment{cId}{1}) ;
                      fprintf('   [<a href="%s">comment</a> ] %s.\n', ...
                          commentUrl, msg) ;
                  end
              end
          end
      end
  end
  function [tf, msg] = isLikelyUnformatted(content)
      tf = true ;
      % Eliminate content within <pre>..</pre> and |..| (<tt>..</tt>) tags,
      % so we work on what is meant to be text.
      buffer  = regexp(content, '<pre.*?</pre>', 'split') ;
      content = [buffer{:}] ;
      buffer  = regexp(content, '<tt.*?</tt>', 'split') ;
      content = [buffer{:}] ;
      % Check for a few indicators.
      if ~isempty(regexp(content, '\w:\w', 'once'))
          msg = 'range def. found' ; return ; end
      if ~isempty(regexp(content, '\w\(', 'once'))
          msg = 'function call(s)/def(s) found' ; return ; end
      if ~isempty(regexp(content, '(?<!nbsp);</p>', 'once'))
          msg = '";</p>" found' ; return ; end
      tf  = false ;
      msg = '' ;
  end
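To see how the three indicators behave, and why the call detection produces false positives, the patterns can be tried directly on a few strings. This is only a sketch: the sample paragraphs are made up, `hasMatch` is a hypothetical helper, and `strictCallPat` is an assumed, stricter alternative that is not part of the crawler.

```matlab
% Hypothetical helper: true when the pattern occurs at least once.
hasMatch = @(str, pat) ~isempty(regexp(str, pat, 'once'));

% Indicator patterns, as used in isLikelyUnformatted.
rangePat = '\w:\w';            % range definition, e.g. 1:10
callPat  = '\w\(';             % function call or definition, e.g. plot(
semiPat  = '(?<!nbsp);</p>';   % paragraph ending in ";" (but not in "&nbsp;")

% Assumed stricter call pattern (not part of the crawler): only flag
% calls that appear on the right-hand side of an assignment.
strictCallPat = '=\s*\w+\(';

% Made-up sample paragraphs.
codeLike  = '<p>x = linspace(0, 1, 100);</p>';
plainText = '<p>Use the function sum() to add the elements.</p>';
nbspText  = '<p>Thanks in advance&nbsp;</p>';

samples = {codeLike, plainText, nbspText};
for k = 1 : numel(samples)
    s = samples{k};
    fprintf('%-52s range=%d call=%d semi=%d strictCall=%d\n', s, ...
        hasMatch(s, rangePat), hasMatch(s, callPat), ...
        hasMatch(s, semiPat), hasMatch(s, strictCallPat));
end
```

The plain-text sentence trips `callPat` because of "sum()", which is the false positive described above. The stricter pattern avoids it, at the price of missing calls that are not part of an assignment, such as a bare plot(x, y).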
4 Comments
Evan
on 8 Aug 2013
Edited: Evan
on 8 Aug 2013
This is a really slick little function. And if something similar were implemented on TMW's end, even false positives would be pretty harmless. I think we'd still end up with people neglecting formatting (after all, nowadays popup dialogs and warning messages are either 1) meant to be ignored or 2) an exercise for honing your ability to quickly close windows). Still, it's a simple enough feature that it's worth having.
Cedric
on 8 Aug 2013
I thought about it a little more and, somehow, I wouldn't mind automatically getting an intermediary page when we submit a question (not for comments or answers, but for questions only), with a big red message reminding us about formatting and displaying the post as a preview. After all, we don't post that many questions, so it wouldn't be annoying.
I think that this mechanism is light enough so it wouldn't take Mathworks that much time/work to implement.
More Answers (5)
Jan
on 6 Aug 2013
3 Comments
Cedric
on 6 Aug 2013
Edited: Cedric
on 6 Aug 2013
@Jan: I meant at Mathworks level, in PHP or whatever language they are using, they could implement a detection based on this list of criteria and display a warning if needed. These criteria would certainly catch most cases where there is unformatted code (and we don't need 100% accuracy), and their implementation is a matter of building a few regular expressions.
Also, this mechanism wouldn't prevent a user from submitting an answer/comment; it would just add a warning page displaying a big red message saying that some unformatted code seems to have been detected and asking the user to either go back or confirm that he/she wants to post the current content.
That said, if there is any interest, I am probably able to build a MATLAB-based crawler which detects threads with unformatted code based on the aforementioned list of criteria, yes.
Evan
on 6 Aug 2013
Edited: Evan
on 6 Aug 2013
Is there any way to have two levels of permissions for editing another user's question? At the moment, assuming there are no users who have been granted privileges prematurely, there are only 15 users capable of editing a question. I would say 50% or less of these users have been very active on these forums over the past month or so.
I understand that editing another user's question is a privilege that has potential for abuse and should therefore be difficult to obtain, but if it were possible to split the permissions in some manner, allowing users with, say, a reputation of 750 or 1000 to use the "format code" feature without modifying the text of a question, would it be worth the effort?
Perhaps it's cynical, but I think that it's going to be nearly impossible to get new posters to adhere to the standards for formatting. We can put up announcements, add brightly colored textboxes to the "new question" page, and even flag certain keywords, but unless we actually make it impossible to submit a post without formatting those flagged keywords, people will continue submitting giant walls of unformatted code.
And not to hijack this topic, but another feature I would like to see is the ability to move comments and answers for those cases where users don't catch on to the differences between them.
2 Comments
Cedric
on 6 Aug 2013
Edited: Cedric
on 6 Aug 2013
This is related to the "janitor" type of work mentioned here in my answer at the bottom (copied below).
" " "
I've seen Walter mentioning "janitor" type of work on the forum, and I think that a 500 rep. should allow people to do this kind of work actually, if they have time and energy for this (and if they are trusted; I'll develop this below). It is obviously tricky to give enough privileges to perform janitor work without giving all privileges, but it is certainly worth working on finding a solution.. in the sense that currently you have to be a high rep. member to spend your time on e.g. formatting questions instead of answering them (..).
Jan recently posted a question about formatting and I commented mentioning "trustees". I think that it is meaningful in the sense that active people in the top 10 rep. know roughly who is answering questions and have an idea about the quality of the answers; in other words, I think that privileges would be better distributed by a mechanism involving rep. points but, more importantly, a sponsorship/trustee mechanism involving these top 10 rep. active people.
Mixing this idea and the "janitor" type of work mentioned above, I believe that it would be quite interesting if members hitting 500 rep. points, and defined as trustees by top 10 members, would get a limited privilege for editing questions (maybe more interesting than giving a privilege for accepting answers). To illustrate, a logic could be:
- Rep. points provide recognition as they should, but no privilege. These are separate aspects of the "life" on the forum.
- Rep. points + the sponsorship/trustee flag provide privileges. E.g. 500 pts + trustee provide "janitor type of work" privileges. People with these privileges are thought to be able to know when/where they are proficient enough to accept answers, and hence have the privilege to accept answers. They can also edit questions without having the full editor privilege, which could be defined as: adding/deleting spaces, underscores, stars, and CR/LF. This would allow performing most of the formatting tasks without leaving the possibility to change the content (hence addressing Jan's concern in his post mentioned above). It would be relatively easy to implement the check: after removal of these characters in both the original and the modified text, the strings must match.
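The check described in the last point could look like the following minimal sketch. It assumes that the original and the edited post are available as character vectors; `stripFormatting` and `isFormattingOnlyEdit` are hypothetical names, not existing forum functionality.

```matlab
% Characters a limited edit may add or delete:
% space, underscore, star, CR, and LF.
stripFormatting = @(txt) regexprep(txt, '[ _*\r\n]', '');

% An edit counts as formatting-only when both versions are identical
% once those characters are removed.
isFormattingOnlyEdit = @(original, edited) ...
    strcmp(stripFormatting(original), stripFormatting(edited));

nl = char(10);   % line feed
original = 'Here is my code: for k=1:3 disp(k) end';
reformat = ['Here is my code:' nl nl '  for k=1:3' nl '    disp(k)' nl '  end' nl];
tampered = 'Here is my code: for k=1:4 disp(k) end';

okReformat = isFormattingOnlyEdit(original, reformat);
okTampered = isFormattingOnlyEdit(original, tampered);
fprintf('reformatted: %d, content changed: %d\n', okReformat, okTampered);
```

One consequence of the rule as stated: because underscores and stars are on the allowed list, an edit that turns my_var into myvar or drops a multiplication sign would also pass, so a real implementation might want to restrict those two characters to places outside the code.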
" " "